Abstract

Underwater wireless communication is quickly becoming a necessity for applications
in ocean science, defense, and homeland security. Acoustics remains the only practical
means of accomplishing long-range communication in the ocean. The acoustic
communication channel is fraught with difficulties including limited available
bandwidth, long delay-spread, time-variability, and Doppler spreading. These
difficulties reduce the reliability of the communication system and make high data-rate
communication challenging. Adaptive decision feedback equalization is a common method
to compensate for distortions introduced by the underwater acoustic channel. Limited
work has been done thus far to use the physics of the underwater channel to improve
and better understand the operation of a decision feedback equalizer.
This thesis examines how to use physical models to improve the reliability and reduce
the computational complexity of the decision feedback equalizer. The specific topics
covered by this work are: how to handle channel estimation errors for the time-varying
channel, how to incorporate angular constraints imposed by the environment into an
array receiver, what happens when there is a mismatch between the true channel order
and the estimated channel order, and why there is a performance difference between the
direct adaptation and channel estimation based methods for computing the equalizer
coefficients. For each of these topics, algorithms are provided that help create a more
robust equalizer with lower computational complexity for the underwater channel.

Thesis  Supervisor:  James  C.  Preisig 

Title:  Associate  Scientist,  Woods  Hole  Oceanographic  Institution 
Acronyms

Acronym   Definition
AoA       Angle of Arrival
BPSK      Binary Phase Shift Keyed
CEB       Channel Estimate Based
DA        Direct Adaptation
DFE       Decision Feedback Equalizer
EW-RLS    Exponential-Weighted Recursive Least-Squares
FSK       Frequency Shift Keyed
IID       Independent and Identically Distributed
ISI       Inter Symbol Interference
MAE       Minimum Achievable Energy
MCM       Multiple Constraint Method
MMSE      Minimum Mean Squared Error
MSE       Mean Squared Error
MVDR      Minimum Variance Distortionless Response
PSD       Power Spectral Density
PSK       Phase Shift Keyed
QAM       Quadrature Amplitude Modulation
QPSK      Quadrature Phase Shift Keyed
RF        Radio Frequency
RLS       Recursive Least Squares
          Soft Decision Error
SINR      Signal-to-Interference-plus-Noise Ratio
SNR       Signal-to-Noise Ratio
SOFAR     Sound Fixing And Ranging
WSSUS     Wide Sense Stationary Uncorrelated Scattering
Chapter 1


Introduction 


Since  the  dawn  of  time,  people  have  looked  at  the  surface  of  the  ocean  and  wondered 
what  secrets  might  be  hidden  in  the  depths.  Born  out  of  this  wonder,  oceanography 
is  a  science  devoted  to  understanding  the  mysteries  of  the  seas.  Oceanographers  have 
made many important discoveries that have fundamentally changed our understanding
of the world. Technology is a driving force behind many of these underwater
discoveries.  One  important  part  of  technology,  and  not  coincidentally  the  focus  of 
this  thesis,  is  wireless  communication.  In  oceanography,  wireless  communication  is 
used  to  increase  portability,  simplify  deployments,  and  decrease  mission  cost. 

Acoustic radiation is currently the only practical way to wirelessly transmit
information underwater over distances of more than a few hundred meters. Wideband
electromagnetic radiation is common in terrestrial communications but is highly
attenuated after propagating short distances through the ocean: electromagnetic
radiation in the megahertz to gigahertz range (radio frequency or RF radiation)
propagates only a few meters before being attenuated, and electromagnetic radiation in
the optical range (especially blue-green light) propagates around a hundred meters. In
contrast, acoustic radiation has relatively low attenuation and can propagate long
distances through the ocean. Acoustics have been used to signal through thousands of
kilometers of water [129] and have been used in virtually every ocean environment [77].

The goal of wireless, acoustic communication is to transmit digital data reliably
under minimum data rate and maximum power constraints. There are several
challenges when communicating acoustically through the underwater channel:
inter-symbol interference (ISI) caused by reverberation [76], limited signal bandwidth
due to frequency dependent absorption [103], and time-variability of the channel [75].
Every ocean environment (i.e. every communication setup) has unique operating
parameters (depth, system geometry, water column chemistry, etc.) so there is no
universal underwater acoustic channel model for system analysis. As a result, underwater
communication systems are often adaptively tuned based on in-situ measurements.

To  mitigate  channel  induced  signal  distortions,  the  received  signal  is  filtered  in  a 
structure  known  as  an  equalizer.  An  equalizer  produces  an  estimate  of  the  transmitted 
symbol  using  a  weighted  combination  of  the  received  signal  and,  in  some  structures, 
past  symbol  estimates.  The  metric  used  to  gauge  equalizer  performance  is  the  average 
squared  error  between  the  equalizer  output  and  the  transmitted  data  symbol. 
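
As an illustration of the structure just described, the following is a minimal sketch of a symbol-spaced decision feedback equalizer with fixed (non-adaptive) taps. The function names, the causal feedforward window, the BPSK decision rule, and the two-tap toy channel are all assumptions made for this example; they are not taken from the thesis.

```python
def bpsk_decision(y):
    """Hard decision for BPSK: map a real soft output to +1 or -1."""
    return 1.0 if y >= 0 else -1.0


def dfe_equalize(received, ff_taps, fb_taps, decide=bpsk_decision):
    """Run a fixed-tap DFE over a block of received samples.

    The soft output is a weighted sum of received samples (feedforward part)
    minus a weighted sum of past symbol decisions (feedback part).
    """
    soft, hard = [], []
    for n in range(len(received)):
        ff = sum(w * received[n - i]
                 for i, w in enumerate(ff_taps) if n - i >= 0)
        fb = sum(b * hard[n - 1 - j]
                 for j, b in enumerate(fb_taps) if n - 1 - j >= 0)
        y = ff - fb
        soft.append(y)
        hard.append(decide(y))
    return soft, hard


def mean_squared_error(soft, symbols):
    """Average squared error between equalizer output and transmitted symbols."""
    return sum(abs(y - s) ** 2 for y, s in zip(soft, symbols)) / len(symbols)


# Toy example (assumed): noise-free channel with one trailing ISI tap of 0.5.
symbols = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0]
received = [s + 0.5 * (symbols[n - 1] if n > 0 else 0.0)
            for n, s in enumerate(symbols)]
soft, hard = dfe_equalize(received, ff_taps=[1.0], fb_taps=[0.5])
mse = mean_squared_error(soft, symbols)
```

In this noise-free toy case the feedback tap cancels the trailing interference exactly, so the decisions match the transmitted symbols. A practical receiver would adapt both tap vectors and would normally let the feedforward filter span samples on both sides of the current symbol.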

Adaptive equalizers were initially designed for the wired telephone channel [62, 63]
using several simplifying assumptions, such as slow time-variation and white
observation noise. These approximations do not generally hold for the underwater
acoustic channel; new thinking is needed to design equalizers that handle the harsh
conditions of the underwater channel (e.g. large delay-spread, quickly varying
coefficients, frequency selective fading, etc.) and that are computationally simple
enough to be implemented on real-time systems. Using physical understanding of the
underwater acoustic communication channel, this thesis proposes several equalizer
improvements with particular attention toward limiting computational complexity.

1.1  Contributions  of  this  thesis 

The  goal  of  this  thesis  is  to  analyze  past  equalizer  design  assumptions  and  propose  new 
algorithms  for  limiting  complexity  and  improving  performance.  Specific  contributions 
toward  this  goal  are: 

1. A description of how the physical considerations of the communication channel
affect the structure of the effective noise correlation matrix used in the
computation of the equalizer coefficients.


The effective noise includes observation noise, sensor noise, and noise from
channel estimation errors. Traditionally, the effective noise correlation matrix has
been approximated using a scaled identity matrix. Chapter 3 shows that the
mean squared error can be reduced by as much as 4 dB using a fully populated
effective noise correlation matrix and the computational complexity is
reduced by assuming a Toeplitz matrix structure (which also further reduces
mean squared error).

2. Analysis showing the best non-adaptive combination of elements from a
multi-element receiver that reduces computational complexity without sacrificing
performance.

In Chapter 4, a set of static beams is found which reduces computational
complexity without sacrificing too much performance (at most a decibel or two
of degradation). Experimental data reveal that there are some channel conditions,
such as calm seas with a low signal-to-noise ratio, where the non-adaptive beams
outperform a fully adaptive beamspace processor. Data-driven techniques for
determining the appropriate number of beams are analyzed.

3.  An  analysis  of  how  fixing  the  channel  model  order  affects  the  mean  squared  error 
equalizer  performance. 

A  channel  estimate  based  equalizer  requires  a  fixed  number  of  modeled  channel 
coefficients.  In  Chapter  5  it  is  shown  that  when  the  model  has  a  different 
number  of  coefficients  than  the  true  channel,  equalizer  performance  is  degraded. 
A  method  of  improving  performance  by  adjusting  the  noise  correlation  matrix 
is  detailed. 

4. A comparison of direct adaptation and channel estimate based equalizer
algorithms, showing why the channel estimate based equalizer has lower mean squared
error at high SNR.

At high SNR, data symbol estimates from channel estimate based equalizers
have lower mean squared error than estimates from direct adaptation equalizers.
Chapter 6 presents new analysis which explains this effect: minimum mean squared
error (MMSE) equalizer coefficients have a shorter correlation window than channel
coefficients, so tracking the equalizer coefficients (i.e. DA equalization) has higher
mean squared error than tracking the channel coefficients (i.e. channel estimate based
equalization).


1.2  Related  work 

This  thesis  focuses  on  analysis  of  the  decision  feedback  equalizer  (DFE)  for  underwater 
communication.  There  are  many  references  which  discuss  the  operation  of  the  DFE  in 
a  variety  of  contexts,  such  as  [79,  86].  Monsen  [64]  wrote  a  seminal  paper  examining 
the  effect  of  DFE  equalization  on  a  fading  channel  where  theoretical  lower  performance 
bounds  were  derived.  Qureshi  [81]  wrote  a  nice  tutorial  paper  summarizing  the  work 
on  adaptive  equalization  prior  to  1985. 

The  goal  of  most  equalizers  is  to  reduce  the  squared  error  between  the  data  symbol 
estimate  and  the  true  data  symbol.  There  have  been  several  studies  examining  the 
nature of this error. Eleftheriou and Falconer [31] examined how recursive least squares
(RLS)  tracking  error  affected  DFE  performance.  They  proposed  separating  the  error 
into  the  sum  of  two  parts:  one  term  caused  by  channel  estimation  errors  due  to  time 
variability  and  another  term  caused  by  noise. 

Stojanovic [106] proposed an alternate decomposition of the error term: first into
causal and acausal parts and then into a channel estimation error part and a noise
part. She postulated that if the channel estimation error could be estimated, it could
be combined with the observation noise estimate to create a total (effective) noise
estimate. She and Zvonar extended this research to multi-user equalization in [111].

One form of the MMSE equation for equalizer coefficients is an inverse matrix
multiplied by a column vector. Dzung [27] simplified equalizer error analysis of
adaptive algorithms by replacing the inverse of the random matrix with the inverse of
the expectation of the random matrix.


Preisig  [75]  examined  how  imperfect  channel  estimation  affected  the  equalizer 
taps.  He  proposed  an  estimated  error  DFE,  where  the  error  covariance  matrix  is 
estimated  using  the  received  signal,  past  data  symbols,  and  a  channel  estimate.  His 
work  showed  definitively  that  the  effective  error  can  be  modelled  as  the  noise  plus  a 
term  which  accounts  for  channel  estimation  errors. 

Building  on  the  work  of  Eleftheriou  and  Falconer  [31],  Nadakuditi  and  Preisig 
[65,  66]  presented  a  more  sophisticated  separation  of  the  channel  estimation  error 
when  using  a  recursive  least  squares  algorithm.  Employing  an  extended  state  space 
model  they  provided  derivations  which  related  the  observed  noise  correlation  matrix  to 
both the channel and noise correlation matrices. In the same work [65, 66], Nadakuditi
and Preisig presented results on the effect of fixing channel model order on channel
estimation errors when using a recursive least squares algorithm.

Stojanovic et al. pioneered the analysis and application of advanced equalization
techniques for underwater communication [80, 105, 107, 108, 110]. Using
experimental data, she verified that equalization was possible underwater. She also
examined some environmental factors that affect communication, such as noise and
absorption, and derived useful approximations [101, 103].

Preisig et al. also examined how ocean physics affects underwater communication
systems in [74, 75, 76, 78]. They focused on the effect of time-varying environments
(surface waves) on communication systems and how to compensate for environmental
distortions using equalization. One interesting observation was that waves act as
concave mirrors which focus the acoustic energy and cause large, fast amplitude changes
at the receiver. Li et al. [60, 59] proposed using the delay-Doppler characterization
of the channel along with sparse techniques to mitigate the effect of these mirrors.

In a seminal work on multichannel, adaptive equalization for underwater
communication, Stojanovic et al. [105] found that the optimal multichannel combiner
is a matched filter followed by a maximum likelihood sequence estimator (MLSE).
Since the MLSE is impractical due to the large channel delay spread in underwater
environments (which can span hundreds of symbols), she used an adaptive DFE as
the channel combiner. Using experimental data, she showed that for the underwater
channel, the mean squared error of the multichannel DFE output is not significantly
greater than the mean squared error of the MLSE output.

The  same  set  of  authors  showed  that  when  the  direction  of  arrival  for  all  multipath 
components  is  known  a  multichannel  DFE  with  a  beamformer  is  equivalent  to  a 
multichannel  DFE  without  a  beamformer  [109].  They  further  showed  that  any  set 
of  beam-weights  that  spans  the  signal  space  produces  equivalent  mean  squared  error 
performance  when  the  observation  noise  is  spatially  and  temporally  white.  Using  the 
multichannel  DFE  with  a  beamformer  reduced  the  computational  complexity  of  the 
receiver  when  the  number  of  multipath  arrivals  was  less  than  the  number  of  sensors. 

Using  physics-based  constraints,  the  communication  receiver  can  better  estimate 
the channel and reduce computational complexity by reducing the number of parameters
to be estimated. Kraay and Baggeroer [50] proposed using physical constraints
for  array  processing  by  constraining  the  signal  covariance  matrix  to  be  realizable  when 
the  received  signal  was  a  sum  of  narrowband  plane  waves.  Their  goal  was  to  reduce 
the  number  of  snapshots  needed  to  properly  estimate  a  covariance  matrix. 

Papp  et  al.  [71,  72]  used  a  different  form  of  physical  constraint:  mode-filtering. 
They  showed  that  mode-filtering  improves  array  signal-processing.  They  also  showed 
using  experimental  data  that  mode-filtering  a  signal  before  equalization  had  higher 
mean  squared  error  than  an  equalizer  with  no  mode-filter. 

LeBlanc and Beaujean [55, 56] proposed applying principal component analysis
(PCA) to acoustic communication systems with receive arrays. To improve equalizer
performance the beams were decorrelated by using the eigenvectors of the received
signal correlation matrix as the beamformer weights. They focused mainly
on the decorrelating effects of this technique and not on dimensionality reduction.

Two common methods for computing adaptive equalizer coefficients are direct
adaptation (DA), where the equalizer coefficients are estimated directly from the
received data, and channel estimate based (CEB), where a channel estimate is used to
compute the coefficients. There have been several studies comparing DA
equalization with CEB equalization, but the performance comparisons contained only
empirical evidence without analysis. Many authors had hypotheses about the cause of
the performance difference, but there was no consistency between them.

An  often  cited  work  comparing  DA  and  CEB  equalizers  on  a  Rayleigh  fading 
channel  is  by  Shukla  et  al.  [91].  The  authors  showed  that  when  the  channel  order 
is  known  and  the  signal  to  noise  ratio  (SNR)  is  large,  the  DA  approach  had  higher 
mean  squared  error  than  the  CEB  approach. 

Fechtel  and  Meyr  [32]  also  demonstrated  a  difference  in  mean  squared  error  (at 
high  SNR)  between  DA  and  CEB  equalizers,  assuming  the  CEB  equalizer  had  perfect 
channel  knowledge.  They  hypothesized  that  the  difference  was  due  to  the  lag  in  the 
DA  equalizer  which  implicitly  has  to  estimate  the  channel  state  information. 

Lee and Cox [58] examined the performance difference between the DA and CEB
methods when the true channel order was not known. They experimentally
validated that for an unknown channel length the DA method outperformed the
CEB method. They also found that a matrix regularization term was effective to
combat the difference in performance between the two methods. In later work [57],
they examined the effect of channel mismatch on the bit error rate (BER) of a maximum
likelihood sequence estimator (MLSE).

An alternative to equalization is time-reversal [33, 85]. In time-reversal techniques,
a channel estimate is convolved with the data signal. The channel is estimated by
sending a pulse through the channel and recording the received signal. This form of
channel estimation is not robust to channel variations. An array is used to either
transmit or receive (or both), which provides an array gain proportional to the
number of sensors in the array. Time-reversal methods both temporally and spatially
match filter the received signal to increase the effective SNR.
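
The temporal matched-filtering step can be sketched for a single sensor as follows. This is only an illustrative sketch: the helper names, the two-tap complex channel, and the use of a perfect channel estimate are assumptions for the example, and the spatial (array) combining described above is omitted.

```python
def convolve(a, b):
    """Full linear convolution of two complex sequences."""
    out = [0j] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def time_reversal_filter(received, channel_estimate):
    """Filter with the time-reversed, conjugated channel estimate."""
    matched = [h.conjugate() for h in reversed(channel_estimate)]
    return convolve(received, matched)


# Assumed toy channel: a direct arrival plus one weaker, phase-rotated arrival.
channel = [1.0 + 0j, 0.5j]
probe = [1.0 + 0j]                      # single pulse used to probe the channel
received = convolve(probe, channel)     # noise-free probe response
q = time_reversal_filter(received, channel)
peak_index = max(range(len(q)), key=lambda n: abs(q[n]))
```

With a perfect estimate, the output `q` is the autocorrelation of the channel: its peak equals the total channel energy, and the remaining entries are residual inter-symbol interference that time-reversal alone does not remove.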

Time-reversal has been shown to be an effective, low-complexity method for
handling the difficulties of the underwater channel [46, 47, 85] and has been extended
to multi-user scenarios [95, 96, 97]. Results have been confirmed using experimental
data [28, 40]. When the mean squared error of time-reversal was compared with that of
equalization, the equalizer always had lower mean squared error [128]. An equalizer is
thus generally preferred to time-reversal. Attempts have been made to combine both
time-reversal and equalization in one communication system [17, 18, 98].

There has been increasing interest in combining equalization with error correcting
coding,  a  technique  known  as  turbo-equalization  [26,  49,  117,  118].  The  data  is  first 
encoded  using  an  error  correcting  code  and  the  resulting  signal  is  transmitted  through 
a  channel.  The  equalizer  filters  the  received  signal  and,  rather  than  making  a  symbol 
decision,  transmits  the  filtered  output  directly  to  the  decoder.  The  raw  output  of 
the  equalizer  without  a  symbol  decision  is  known  as  soft  information.  A  decoder  uses 
this  soft  information  to  refine  the  transmitted  symbol  probabilities.  The  updated  soft 
information  is  sent  back  to  the  equalizer  and  the  process  iterates. 

Turbo-equalization has been shown to work well for the underwater channel [19, 20,
25, 92]. One issue with turbo-equalization is its computational complexity. There
has been work done to reduce the computational complexity [93, 117], but more is still
needed. The techniques presented in this thesis could be applied to the equalizer
portion of a turbo-equalizer to improve performance and reduce computational
complexity in underwater environments.

1.3  Organization 

The remainder of this thesis is organized as follows: Chapter 2 provides the
mathematical and conceptual background to understand the remainder of the thesis.
Chapter 3 describes how channel estimation errors can be accounted for when calculating
equalizer filter weights. Chapter 4 explains how knowledge of the physically constrained
arrival  angles  can  be  incorporated  into  an  array  receiver  to  reduce  computational 
complexity.  Chapter  5  presents  analysis  of  how  assumptions  of  the  channel  order 
affect  the  equalizer  error.  Chapter  6  discusses  the  difference  between  the  CEB  and 
the  DA  equalizers.  Concluding  remarks  and  areas  for  future  research  are  identified  in 
Chapter  7. 


1.4  Notation 


Throughout  this  thesis,  the  following  notation  is  used: 


Symbol              Definition
$a$                 Non-bold lowercase letters represent scalar constants
$\mathbf{a}$        Bold lowercase letters represent column vectors
$\mathbf{A}$        Bold uppercase letters represent matrices
$^*$                Complex conjugate of the variable (i.e. $a^*$)
$^T$                Transpose of a vector or matrix (i.e. $\mathbf{A}^T$)
$^H$                Hermitian (conjugate transpose) of a vector or matrix (i.e. $\mathbf{A}^H$)
$\|\cdot\|$         2-norm of the enclosed quantity (i.e. $\|\mathbf{a}\|$)
$\hat{\ }$          Estimate of a quantity (i.e. $\hat{a}$)
$\mathbf{I}$        Square identity matrix (context sized)
$\mathbf{I}_M$      $M \times M$ square identity matrix
$\mathbf{0}$        Zero matrix or vector (context sized)
$E[\cdot]$          Expectation of enclosed quantity
Chapter 2


Background 

2.1  Underwater  communication 

The underwater channel remains one of the most challenging communication
environments. Designing a reliable communication system remains an active area of
research. Knowing where a system is going to be deployed is important when designing an
underwater acoustic (UWA) communication system. Communicating vertically through
the ocean tends to be the simplest regime since often there is little multipath. The
name of this environment is the Reliable Acoustic Path (RAP) [24] due to the fidelity
of the channel. Baffles or directional hydrophones are used to reduce the effects of
surface bounces.

In deep water systems there is less interaction with the surface so the
communication channel is time-invariant, possibly sparse, and widely spread in delay.
Modeling techniques are often employed in deep water channels to determine the
locations where communication is possible due to the channel physics. The direct path
is often the last to arrive in deep water since a natural waveguide exists around the
sound-speed minimum [42].

In shallow water environments interactions with the time-varying ocean surface
are unavoidable. There are also interactions with the bottom and nearby obstacles
plus noise from waves, shipping traffic, and marine life. When operating in such a
dynamic environment, adaptive channel tracking and adaptive equalization techniques,
such as the exponentially weighted recursive least squares (EW-RLS) algorithm,
are essential.
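
For readers unfamiliar with EW-RLS, the standard recursion can be sketched as below. This is a minimal, real-valued illustration under assumed values (a hypothetical static two-tap channel `true_h`, a forgetting factor of 0.98, and an initial inverse correlation matrix of 100 times the identity); it is not necessarily the formulation used elsewhere in this thesis.

```python
import random


def ew_rls_step(w, P, x, d, lam):
    """One exponentially weighted RLS update for real-valued data.

    w: current coefficient estimate, P: inverse weighted correlation matrix,
    x: input (regressor) vector, d: desired sample, lam: forgetting factor.
    """
    n = len(w)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(x[i] * Px[i] for i in range(n))
    k = [v / denom for v in Px]                    # gain vector
    e = d - sum(w[i] * x[i] for i in range(n))     # a priori error
    w = [w[i] + k[i] * e for i in range(n)]
    # P <- (P - k x^T P) / lam; x^T P equals (P x)^T because P is symmetric.
    P = [[(P[i][j] - k[i] * Px[j]) / lam for j in range(n)] for i in range(n)]
    return w, P, e


# Assumed toy setup: track a static two-tap channel from noise-free BPSK data.
rng = random.Random(0)
true_h = [1.0, 0.5]
symbols = [rng.choice((-1.0, 1.0)) for _ in range(200)]
w = [0.0, 0.0]
P = [[100.0, 0.0], [0.0, 100.0]]
for n in range(1, len(symbols)):
    x = [symbols[n], symbols[n - 1]]
    d = true_h[0] * x[0] + true_h[1] * x[1]
    w, P, _ = ew_rls_step(w, P, x, d, lam=0.98)
```

The forgetting factor sets the effective averaging window (roughly 1/(1 - lam) samples): values closer to one average over more data, while smaller values track faster channel variation at the cost of noisier estimates.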

Underwater communication began with the uncoded, analog underwater telephone
or “the Gertrude” which was used to communicate with a manned submersible. As
higher data-rates became necessary and the intended receiver was a machine instead of
a person, more complex systems were needed. The development of these systems began
in the analog domain, but quickly switched to the digital domain where frequency shift
keying (FSK) and more recently phase shift keying (PSK) (for increased bandwidth
efficiency) are used for data modulation. Many detailed papers have been written
concerning the history of underwater communications such as [5, 14, 15, 16, 45, 110].

The  focus  of  this  thesis  is  on  the  well-mixed,  shallow-water  channel  where  the 
isovelocity  assumption  is  appropriate.  Data  is  modulated  using  phase  shift  keying 
(PSK)  techniques  since  this  is  the  current  state  of  the  art.  The  following  subsections 
outline  some  of  the  difficulties  in  communicating  through  the  underwater  environment 
to  familiarize  the  reader  and  to  emphasize  that  this  is  a  harsh  environment  that 
requires  extra  effort. 

2.1.1  Distance  and  SNR 

The majority of ocean noise can be separated into one of four components: turbulence,
shipping, wind, and thermal noise. Turbulence dominates in the low frequency region
under 10 Hz, shipping noise is dominant in the 10-100 Hz region, wind-driven wave
noise prevails in the 100 Hz to 100 kHz region, and thermal noise dominates above
100 kHz [102]. The total noise is the unweighted sum of these four noise components. A
useful approximation of the noise power spectral density (PSD) as a function of the
frequency $f$ in kHz is

$$10 \log_{10} N(f) \approx N_1 - \eta \log_{10} f. \tag{2.1.1}$$

The above expression has units dB re μPa per Hz and the constants are $N_1 = 50$ and
$\eta = 18$ [102].

Common  acoustic  communication  frequencies  are  from  100  Hz  to  100  kHz,  so  wind 
driven noise is dominant. With some notable exceptions (e.g. snapping shrimp and
breaking  ice),  noise  in  the  acoustic  communication  frequencies  is  well  modeled  by  a 
Gaussian  random  process  [102,  126].  In  general,  the  power  spectral  density  of  this 
process  is  not  flat,  so  the  noise  process  is  not  white. 

Attenuation is a function of the acoustic path length $l$ and the frequency of the
signal $f$ [10, 103],

$$A(l,f) = \left(\frac{l}{l_r}\right)^{k} a(f)^{\,l - l_r}, \tag{2.1.2}$$

where $k$ is the spreading factor (cylindrical, spherical, etc.), $a(f)$ is the absorption
coefficient, and $l_r$ is a reference distance. Thorp [115, 116] provided an approximate
expression for the absorption coefficient as a function of frequency which is valid for
frequencies in the range 500 Hz to 50 kHz. Most acoustic communication frequencies
are in this range. The expression for the absorption coefficient is


$$10 \log_{10} a(f) = 0.11 \frac{f^2}{1 + f^2} + 44 \frac{f^2}{4100 + f^2} + 2.75 \cdot 10^{-4} f^2 + 0.003, \tag{2.1.3}$$

where the quantity $10 \log_{10} a(f)$ has units of dB/km and $f$ has units of kHz. For
frequencies lower than 500 Hz, the alternative expression

$$10 \log_{10} a(f) = 0.11 \frac{f^2}{1 + f^2} + 0.011 f^2 + 0.002 \tag{2.1.4}$$

is a better approximation [10, 102]. Fisher and Simmons [34] tied these expressions
to physical and chemical properties of sea water.
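
The two absorption approximations translate directly into code. The sketch below also evaluates the path attenuation in decibels, assuming the common form $A(l,f) = (l/l_r)^k a(f)^{l - l_r}$ with distances in km; the function names, the default reference distance of 1 m, and the hard switch between the two expressions at 500 Hz are choices made for this illustration.

```python
import math


def thorp_absorption_db_per_km(f_khz):
    """Thorp approximation, 10 log10 a(f) in dB/km, for roughly 0.5 to 50 kHz."""
    f2 = f_khz ** 2
    return 0.11 * f2 / (1 + f2) + 44 * f2 / (4100 + f2) + 2.75e-4 * f2 + 0.003


def low_freq_absorption_db_per_km(f_khz):
    """Alternative approximation, 10 log10 a(f) in dB/km, below about 500 Hz."""
    f2 = f_khz ** 2
    return 0.11 * f2 / (1 + f2) + 0.011 * f2 + 0.002


def attenuation_db(l_km, f_khz, k=1.5, l_ref_km=0.001):
    """10 log10 A(l, f): spreading loss plus absorption loss, in dB.

    k is the spreading factor; l_ref_km is the reference distance
    (1 m here, an assumed value).
    """
    if f_khz >= 0.5:
        absorption = thorp_absorption_db_per_km(f_khz)
    else:
        absorption = low_freq_absorption_db_per_km(f_khz)
    spreading = k * 10 * math.log10(l_km / l_ref_km)
    return spreading + (l_km - l_ref_km) * absorption


loss_1km_10khz = attenuation_db(1.0, 10.0)
loss_10km_10khz = attenuation_db(10.0, 10.0)
```

The spreading term grows with the logarithm of distance, whereas the absorption term grows linearly with distance, so absorption dominates the frequency dependence of the loss at long range.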

Using a narrowband approximation and ignoring any multipath effects, the SNR
can be approximated as a function of frequency and distance,

$$\mathrm{SNR}(l,f) = \frac{P/A(l,f)}{N(f)\,\Delta f} = \frac{P}{A(l,f)\,N(f)\,\Delta f}, \tag{2.1.5}$$

where $\Delta f$ is the bandwidth of the receiver and hence of the received noise
(narrowband approximation) [102]. The frequency dependent portion of this expression is
encompassed in the expression $A(l,f)N(f)$ since $P$ is the total source power across
the available bandwidth. Using a practical spreading coefficient of $k = 1.5$, Figure
2.1.1 shows the relationship between frequency and SNR, recreated from [102]; both
the optimal center frequency and 3 dB bandwidth are dependent on transmission
range.

Figure 2.1.1: Signal to noise ratio (narrowband), $1/(A(l,f)N(f))$, as a function of
frequency for various ranges.
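
The range dependence of the optimal frequency can be explored numerically. The sketch below evaluates the frequency dependent factor $1/(A(l,f)N(f))$ in decibels using the noise and Thorp absorption approximations given earlier in this section, then picks the best frequency on a grid. The grid limits (0.5 to 50 kHz in 10 Hz steps), the reference distance of 1 m, and the function names are assumptions for this illustration; the attenuation form $A(l,f) = (l/l_r)^k a(f)^{l - l_r}$ is also assumed.

```python
import math


def absorption_db_per_km(f_khz):
    """Thorp approximation of 10 log10 a(f) in dB/km (f in kHz)."""
    f2 = f_khz ** 2
    return 0.11 * f2 / (1 + f2) + 44 * f2 / (4100 + f2) + 2.75e-4 * f2 + 0.003


def noise_psd_db(f_khz):
    """Approximate noise PSD, 10 log10 N(f), with N1 = 50 and eta = 18."""
    return 50.0 - 18.0 * math.log10(f_khz)


def snr_factor_db(l_km, f_khz, k=1.5, l_ref_km=0.001):
    """10 log10 of 1 / (A(l, f) N(f)), the frequency dependent part of the SNR."""
    a_db = (k * 10 * math.log10(l_km / l_ref_km)
            + (l_km - l_ref_km) * absorption_db_per_km(f_khz))
    return -(a_db + noise_psd_db(f_khz))


def optimal_frequency_khz(l_km):
    """Grid search for the frequency that maximizes the SNR factor."""
    grid = [0.5 + 0.01 * i for i in range(4951)]   # 0.5 kHz to 50 kHz
    return max(grid, key=lambda f: snr_factor_db(l_km, f))


f_opt = {l_km: optimal_frequency_khz(l_km) for l_km in (1.0, 10.0, 100.0)}
```

Under these assumptions the maximizing frequency moves lower as the range increases, because the absorption loss scales with distance while the noise term does not. The spreading term shifts each curve vertically but does not affect where its maximum lies.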


Jensen and Kuperman [41] performed a related analysis to find the optimum
frequency that balanced the propagation and attenuation mechanisms of the shallow
water channel, but did not account for noise power. The optimum frequency is a
general feature of waveguide or ducting propagation and for shallow water channels is
strongly dependent on depth [42]. Typical optimum frequencies when the water depth
is 100 m are 200-800 Hz, lower than the frequencies found in [102] which included the
noise characterization.
2.1.2 Delay spread of the channel


As  sound  travels  from  the  transmitter  to  the  receiver,  it  follows  not  only  a  direct  path, 
but  also  additional  paths  due  to  reflections  from  the  surface,  reflections  from  the  sea 
floor,  and  inhomogeneities  in  the  sound  speed  profile  which  cause  the  refraction  of 
the  sound  paths. 

The  time  difference  between  the  first  received  path  and  the  last  is  referred  to 
as  the  delay-spread  of  the  channel.  The  delay-spread  determines  the  fastest  rate  at 
which  data  can  be  transmitted  without  inter-symbol  interference  (ISI).  Often  the 
length  of  the  delay-spread  is  in  units  of  transmitted  symbol  durations,  the  inverse  of 
the  transmitted  symbol  rate.  Delay  spreads  of  tens  and  hundreds  of  symbols  are  not 
uncommon  in  the  underwater  channel.  This  is  a  stark  contrast  to  the  radio-frequency 
(RF)  channel  which  often  has  on  the  order  of  three  symbols  of  ISI. 

The  delay-spread  of  the  underwater  communications  channel  is  due  to  the  different 
paths  from  the  transmitter  to  receiver.  In  water  shallow  enough  for  an  approximately 
isovelocity  sound  speed  profile  the  delay  spread  induced  by  the  channel  is  due  to 
reflections  from  the  sea  surface,  reflections  from  the  bottom,  and  reflections  from 
anything  in  the  water  column.  These  reflections  are  referred  to  as  macro-multipath 
since they are due to macro features in the environment. These features are usually
assumed to be roughly time-invariant over a time-scale that is much longer than the
symbol duration [100].

The  acoustic  rays  are  better  modeled  as  three-dimensional  tubes  rather  than  two- 
dimensional  lines.  When  the  tube  encounters  an  object,  the  reflections  are  usually 
not  point  reflections,  but  are  reflections  from  an  area  such  as  a  rough  patch  of  sand 
or  rough  sea  surface.  This  causes  each  ray  path  to  spread  in  time,  sometimes  by 
as  much  as  a  few  milliseconds.  The  multipath  due  to  small  scale  features  such  as 
surface  roughness  and  random  ocean  fluctuations  is  referred  to  as  micro-multipath. 
Micro-multipath  is  non-specular  and  some  components  of  the  small  scale  random 
fluctuations can be modeled statistically [100]. In the acoustics literature, the
micro-multipath concept is also known as ray-tubes [42].

Figure 2.1.2: Diagram depicting some of the acoustic paths from the transmitter to
the receiver. The black solid lines show the paths and the blue cylinders are an example
of the spreading in space that would cause a spread in time at the receiver.

Figure  2.1.2  shows  an  example  of  what  multipath  might  look  like  in  a  shallow 
water  environment.  The  macro-multipath  is  represented  by  the  different  paths  (solid 
black  lines)  and  the  micro-multipath  is  a  cylinder  around  this  line,  indicating  the 
spreading  radius. 

In  a  deep  water  environment,  in  addition  to  surface  and  (less  commonly)  bottom 
bounces,  fully  refracted  paths  occur  due  to  a  local  minimum  in  the  sound  speed  profile. 
Sound  tends  to  “favor”  regions  with  slower  sound  speed  and  will  bend  towards  those 
regions.  There  is  a  region  of  the  ocean  known  as  the  Sound  Fixing  and  Ranging 
(SOFAR)  channel  or  the  deep  sound  channel  where  the  sound  speed  is  at  a  minimum. 
The  ray  bending  is  due  to  Snell’s  law  applied  to  a  medium  with  a  continuously 
changing  sound  speed.  The  deep  water  channel  can  have  a  large  time-spread  but 
may  be  sparse  and  is  often  slowly-varying  compared  to  the  time  scales  of  equalizer 
adaptation  relevant  for  acoustic  communication. 

Regions  of  little  acoustic  penetration  due  to  system  geometry  and  the  sound  speed 
profile  are  known  as  shadow  zones.  These  regions  can  occur  because  of  obstructions 
(e.g. seamounts) or, more importantly, because of the waveguide propagation physics.
Shadow zones can form in either the deep or shallow water channels [42, 119]. There is
little signal processing that can be done to correct for shadow zones, so compensation
for shadow zones is accomplished through system placement and mission design.

2.1.3 Doppler, waves, and motion

Each of the transmitter-to-receiver paths experiences varying degrees of time-variability
due to surface waves, internal waves, platform motion, reflections from moving
objects, and currents and tides. Each source of variability can induce either a Doppler
shift, where all of the frequencies are shifted up or down, or a Doppler spread, where
neighboring frequencies are smeared together. A typical Doppler spread is on the
order of 0 to 30 Hz, but depends heavily on the transmit frequency and communication
system parameters (sea surface motion, weather, platform motion, etc.).

Because  the  speed  of  sound  is  much  slower  than  the  speed  of  light,  the  Doppler 
effects  observed  in  the  underwater  channel  tend  to  be  much  more  severe  than  in 
RF channels. To illustrate the Doppler effect more clearly, consider a pulse, $p(t)$,
modulated with a carrier frequency of $f_c$ and transmitted through a sound channel
with constant speed of sound $c_s$ to a receiver moving at a constant velocity $v$ with
respect to the transmitter. The propagation delay of the received signal is
$\tau(t) = \tau_0 - vt/c_s$, where $\tau_0$ is a reference delay. The Doppler effect is
proportional to $a = v/c_s$. The transmitted signal, $s(t)$, is [104]

\[ s(t) = \mathrm{Re}\{p(t)e^{j2\pi f_c t}\}. \tag{2.1.6} \]

The signal observed by the receiver is [104]

\[ r(t) = s\!\left(t + \frac{v}{c_s}t - \tau_0\right) = s(t + at - \tau_0) = \mathrm{Re}\{p(t + at - \tau_0)e^{j2\pi f_c(t + at - \tau_0)}\}. \tag{2.1.7} \]

With respect to the center frequency, the baseband received signal (i.e. the received
signal after demodulation and low pass filtering) is [104]

\[ \tilde{r}(t) = e^{-j2\pi f_c\tau_0}\,p(t + at - \tau_0)\,e^{j2\pi a f_c t}. \tag{2.1.8} \]

Ignoring the phase shift $2\pi f_c\tau_0$, there are two signal distortions observed:

1. The signal is dilated in time by a factor of $1 + a$, i.e. the dilated signal is
$p'(t) = p(t(1 + a))$. A dilation in the time domain causes a corresponding contraction of
the frequency response.

2. The signal has a frequency offset of $af_c$, which is known as a Doppler shift.

The dilation effect can be ignored when the time-bandwidth product of the signal is
appropriately small. If the bandwidth of $p(t)$ is denoted as $B_p$, the signal is approximately
time invariant for a time $B_p^{-1}$. If the data packet has total duration $T$, then the total
dilation of the packet is $vT/c_s$, which must be much less than $B_p^{-1}$ for dilation to be
ignored. Therefore, when the time-bandwidth product, $TB_p$, satisfies the relation

\[ TB_p \ll \frac{c_s}{v}, \]

dilation can be ignored; otherwise, the received signal needs to be re-sampled [120].
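
The two distortions and the time-bandwidth condition can be checked numerically. The sketch below is only illustrative: the function name, the example values, and in particular the `margin` factor used to interpret "much less than" are assumptions and do not come from the text.

```python
def doppler_summary(v, c_s, f_c, T, B_p, margin=10.0):
    """Summarize Doppler effects for relative velocity v (m/s), sound speed
    c_s (m/s), carrier f_c (Hz), packet duration T (s) and bandwidth B_p (Hz)."""
    a = v / c_s                      # Doppler factor a = v / c_s
    doppler_shift_hz = a * f_c       # frequency offset a * f_c
    time_bandwidth = T * B_p         # time-bandwidth product T * B_p
    # T*B_p << c_s/v is equivalent to T*B_p*|a| << 1. The margin is an assumed
    # engineering threshold for "much less than", not a value from the text.
    dilation_negligible = time_bandwidth * abs(a) * margin < 1.0
    return {
        "a": a,
        "doppler_shift_hz": doppler_shift_hz,
        "time_bandwidth": time_bandwidth,
        "dilation_negligible": dilation_negligible,
    }


# Hypothetical scenario: 1.5 m/s platform motion, 12 kHz carrier, 1 s packet, 5 kHz bandwidth.
moving = doppler_summary(v=1.5, c_s=1500.0, f_c=12000.0, T=1.0, B_p=5000.0)
static = doppler_summary(v=0.0, c_s=1500.0, f_c=12000.0, T=1.0, B_p=5000.0)
print(moving)
print(static)
```

In the moving case the product $TB_p$ is larger than $c_s/v$, so under these assumptions the received signal would need to be re-sampled, while the static case has no dilation at all.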

Another way to characterize the Doppler is through the scattering function. Assume
that the channel coefficient at a particular time $t$ and delay $\tau$ is $g(t, \tau)$. If the
channel is known to be wide sense stationary (i.e. the correlation is a function of the
time difference), then the temporal correlation function of the channel is

\[ R_g(\Delta t; \tau_1, \tau_2) = \mathrm{E}\{g(t, \tau_1)\,g^*(t + \Delta t, \tau_2)\}, \tag{2.1.9} \]

which is a function of three variables: the two delays, $\tau_1$ and $\tau_2$, and the difference in
time $\Delta t$. The scattering function of the channel is defined as the Fourier transform
of the temporal correlation function,

\[ S_g(\lambda; \tau_1, \tau_2) = \int_{-\infty}^{\infty} R_g(\Delta t; \tau_1, \tau_2)\,e^{-j2\pi\lambda\Delta t}\,d\Delta t, \tag{2.1.10} \]

where $\lambda$ is the Doppler spreading variable. At a particular delay, $\tau_1 = \tau_2$, this is
an expression of the Doppler spread of the channel. This leads to the relationship
between the coherence time of the channel, $(\Delta t)_c$, and the Doppler spread, $B_d$, [79]

\[ (\Delta t)_c \approx \frac{1}{B_d}. \tag{2.1.11} \]

The coherence time, $(\Delta t)_c$, i.e. the time over which the channel at a particular delay
is correlated, is approximately the inverse of the Doppler spread of the channel, $B_d$.
For a time-invariant channel $(\Delta t)_c = \infty$, so there is no Doppler spread.

In  addition  to  Doppler  effects,  there  are  other  channel  effects  caused  by  the  waves. 
For certain geometries, the waves will act as a concave mirror and focus the
acoustic energy to cause severe but brief changes in the channel magnitude and phase
characteristics [78, 74]. These focusing events cause the path that interacts with the
surface to have a larger magnitude than the direct path and can cause instantaneous
$\pi/4$ shifts in phase.

Wave  motion  can  also  inject  bubbles  into  the  water  volume  [77].  These  create 
a  highly  variable  medium  for  the  sound  to  propagate  through  and  can  increase  the 
absorption  coefficient  or  reduce  the  effective  height  of  the  water  column. 


2.2  Channel  model 

An underwater communication system consists of at least one transmitter and one
receiver.  Figure  2.2.1  shows  an  example  setup  for  an  underwater  acoustic  experiment. 


Figure  2.2.1:  Possible  setup  for  acoustic  model  being  studied  in  this  thesis. 

Received  signals  are  assumed  to  be  sampled  at  baseband,  so  the  analysis  and 
processing  are  done  in  discrete  time  with  complex  valued  signals.  The  acoustic  channel 
is modeled with a finite extent, linear, time-varying impulse response plus additive
noise [79, 120]. The received signal sample at time $n$, $u[n]$, can be written as [75]

\[ u[n] = \sum_{k=-N_a}^{N_c-1} g^*[n, k]\,d[n-k] + \nu[n], \tag{2.2.1} \]

where $d[n]$ is the transmitted data, $\nu[n]$ is complex, baseband noise, and $g[n, k]$ is the
complex, baseband channel impulse response relating the input data at time $n - k$
to the output data at time $n$. The channel includes the transmit and receive filtering
effects in addition to the physical propagation effects. The channel is assumed to
have $N_c$ causal coefficients and $N_a$ acausal coefficients, where the center (zero offset)
coefficient is assumed to correspond to the center of the direct arrival. This definition
is particular to the isovelocity channel; in other environments the direct arrival may
not be the first causal arrival (e.g. the direct arrival is the last arrival for the SOFAR
channel). Eq. (2.2.1) can be written more compactly as

\[ u[n] = \mathbf{g}^H[n]\,\mathbf{d}'[n] + \nu[n], \tag{2.2.2} \]

with

\[ \mathbf{g}[n] = [\,g[n, N_c-1]\ \ldots\ g[n, 0]\ \ldots\ g[n, -N_a]\,]^T \tag{2.2.3} \]

and

\[ \mathbf{d}'[n] = [\,d[n-N_c+1]\ \ldots\ d[n]\ \ldots\ d[n+N_a]\,]^T. \tag{2.2.4} \]

Stacking successive received signal samples, eq. (2.2.2) becomes a matrix-vector
equation,

\[ \mathbf{u}[n] = \mathbf{G}^H[n]\,\mathbf{d}[n] + \boldsymbol{\nu}[n], \tag{2.2.5} \]

where $\mathbf{u}[n]$ is a vector of sampled received data, $\mathbf{d}[n]$ is the transmitted data, $\boldsymbol{\nu}[n]$ is
the sampled noise vector, and $\mathbf{G}[n]$ is the channel convolution matrix. Specifically,

\begin{align}
\mathbf{u}[n] &= [\,u[n-L_c+1]\ \ldots\ u[n]\ \ldots\ u[n+L_a]\,]^T \tag{2.2.6} \\
\mathbf{d}[n] &= [\,d[n-L_c-N_c+2]\ \ldots\ d[n+L_a+N_a]\,]^T \tag{2.2.7} \\
\boldsymbol{\nu}[n] &= [\,\nu[n-L_c+1]\ \ldots\ \nu[n]\ \ldots\ \nu[n+L_a]\,]^T \tag{2.2.8} \\
\mathbf{G}[n] &= \begin{bmatrix}
g[n-L_c+1, N_c-1] & 0 & \cdots & 0 \\
g[n-L_c+1, N_c-2] & g[n-L_c+2, N_c-1] & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & g[n+L_a, -N_a]
\end{bmatrix} \nonumber \\
&= \begin{bmatrix} \mathbf{g}'_{-L_c+1}[n-L_c+1] & \cdots & \mathbf{g}'_{L_a}[n+L_a] \end{bmatrix}, \tag{2.2.9}
\end{align}


where $L_a$ and $L_c$ are the number of acausal and causal feedforward equalizer taps
in each feedforward section of a decision feedback equalizer. The transmitted data
symbols are assumed to be drawn from a zero-mean random process with variance
$\sigma_d^2$. The noise and transmitted data correlation matrices are defined as

\begin{align}
\mathbf{R}_\nu &= \mathrm{E}\{\boldsymbol{\nu}[n]\boldsymbol{\nu}^H[n]\} \tag{2.2.10} \\
\mathbf{R}_d &= \mathrm{E}\{\mathbf{d}[n]\mathbf{d}^H[n]\} = \sigma_d^2\,\mathbf{I} = \mathbf{I}. \tag{2.2.11}
\end{align}

The last equality for $\mathbf{R}_d$ highlights an assumption used for the remainder of this
thesis (unless otherwise noted) that the transmitted data symbols are white with
unit energy, i.e. $\sigma_d^2 = 1$.

Each of the columns of the matrix $\mathbf{G}[n]$, denoted above as $\mathbf{g}'_i[m]$, is the channel
impulse response vector at time $m$, $\mathbf{g}[m]$, padded with zeros so the matrix multiplication
$\mathbf{G}^H[n]\,\mathbf{d}[n]$ is equivalent to the convolution of the channel impulse response with
the transmitted data from eq. (2.2.1).

To handle fractionally spaced sampling (more than one sample per symbol), the
number of rows of $\mathbf{G}[n]$ is increased proportional to the fractional sampling rate
(number of samples per symbol). The length of the noise and received data vectors
is increased accordingly. The length of the transmitted symbol vector remains
unchanged. Except when noted, symbol rate sampling is assumed for notational
simplicity; it is straightforward to extend the results to handle fractional-rate sample
spacing.
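
The structure of the channel convolution matrix can be made concrete with a short numerical sketch. The code below assumes symbol rate sampling and omits the noise; the helper name `channel_convolution_matrix`, the random taps, and the dimensions are hypothetical choices made only for illustration. It builds $\mathbf{G}[n]$ column by column from zero-padded impulse responses ordered as in eq. (2.2.3) and checks that $\mathbf{G}^H[n]\mathbf{d}[n]$ reproduces the sample-by-sample sum of eq. (2.2.1).

```python
import numpy as np


def channel_convolution_matrix(g_of_time, n, L_c, L_a):
    """Build G[n]; g_of_time(m) returns g[m] ordered [g[m, N_c-1], ..., g[m, -N_a]]."""
    n_cols = L_c + L_a                      # one column per received sample u[m]
    n_taps = len(g_of_time(n))              # N_c + N_a
    G = np.zeros((n_cols + n_taps - 1, n_cols), dtype=complex)
    for c in range(n_cols):
        m = n - L_c + 1 + c                 # time index of this column
        G[c:c + n_taps, c] = g_of_time(m)   # zero-padded copy of g[m]
    return G


rng = np.random.default_rng(0)
N_c, N_a, L_c, L_a, n = 3, 1, 4, 2, 20
# taps[m] plays the role of the time-varying impulse response g[m] (hypothetical values).
taps = rng.standard_normal((40, N_c + N_a)) + 1j * rng.standard_normal((40, N_c + N_a))
data = rng.choice([-1.0, 1.0], size=40) + 0j          # d[j] for j = 0..39

G = channel_convolution_matrix(lambda m: taps[m], n, L_c, L_a)
d_vec = data[n - L_c - N_c + 2: n + L_a + N_a + 1]    # d[n] vector of eq. (2.2.7)
u_matrix = G.conj().T @ d_vec                          # noise-free eq. (2.2.5)

# Direct evaluation of the sum in eq. (2.2.1), noise omitted.
k_values = np.arange(N_c - 1, -N_a - 1, -1)            # same ordering as g[m]
u_direct = np.array([np.sum(np.conj(taps[m]) * data[m - k_values])
                     for m in range(n - L_c + 1, n + L_a + 1)])

print(G.shape, np.allclose(u_matrix, u_direct))
```

The matrix has $L_c + L_a + N_c + N_a - 1$ rows, one per transmitted symbol in $\mathbf{d}[n]$, and $L_c + L_a$ columns, one per received sample.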


2.3  Least-squares  estimation 

2.3.1  Setup 

Many problems from signal processing, communications, and control have a similar
form: estimate the vector $\mathbf{w}$ when the statistics of the random vector $\mathbf{x}$ and random
variable $y$ are known, $v$ is random noise, and

\[ y = \mathbf{w}^H\mathbf{x} + v. \]

Since the noise is random it is not possible to estimate $\mathbf{w}$ exactly, so a metric is needed
to determine the quality of the estimate. An extremely common metric is the mean
squared error, which is the expected squared magnitude of the difference between $y$ and
$y_{est} = \mathbf{w}_{est}^H\mathbf{x}$, where $\mathbf{w}_{est}$ is the estimate of $\mathbf{w}$. The minimization problem using this
metric is

\[ \mathbf{w}_{mmse} = \arg\min_{\mathbf{w}_{est}} \mathrm{E}\{|y - \mathbf{w}_{est}^H\mathbf{x}|^2\}. \tag{2.3.1} \]

The solution to this minimization problem is the minimum mean squared error
(MMSE) solution

\[ \mathbf{w}_{mmse} = \mathrm{E}\{\mathbf{x}\mathbf{x}^H\}^{-1}\mathrm{E}\{\mathbf{x}y^*\} = \mathbf{R}_x^{-1}\mathbf{r}_{xy} = \mathbf{P}\mathbf{r}_{xy}, \tag{2.3.2} \]

where $\mathbf{R}_x = \mathrm{E}\{\mathbf{x}\mathbf{x}^H\}$, $\mathbf{r}_{xy} = \mathrm{E}\{\mathbf{x}y^*\}$, and $\mathbf{P} = \mathbf{R}_x^{-1}$. Assuming both $\mathbf{x}$ and $y$ are
zero-mean, this solution provides the unbiased estimate of the parameter vector $\mathbf{w}$
with the minimum mean squared error. The solution can be modified to handle the
non-zero mean case as well. In the communication context, the MMSE problem setup
is used for channel estimation and for equalizer coefficient estimation.

When the statistics of $\mathbf{x}$ and $y$ are not known, but there are observations available
(i.e. $\mathbf{x}_1, y_1, \mathbf{x}_2, y_2, \ldots, \mathbf{x}_N, y_N$), the observed data can be arranged as

\[ \mathbf{X} = \begin{bmatrix} \mathbf{x}_1^H \\ \mathbf{x}_2^H \\ \vdots \\ \mathbf{x}_N^H \end{bmatrix}, \qquad
\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}, \]

and least squares methods can then be used. The least squares problem is related to
the MMSE problem and has a very similar form. In the least squares problem, a vector $\mathbf{w}$ is
sought which solves

\[ \mathbf{X}\mathbf{w} = \mathbf{y}. \]

When the known matrix $\mathbf{X}$ is tall, i.e. has more rows than columns, usually there is
no exact solution. A tall matrix also indicates there are more observations $\mathbf{y}$ than
parameters $\mathbf{w}$. The least squares estimate minimizes the squared error between $\mathbf{X}\mathbf{w}$
and $\mathbf{y}$,

\[ \mathbf{w}_{LS} = \arg\min_{\mathbf{w}} \|\mathbf{y} - \mathbf{X}\mathbf{w}\|^2. \tag{2.3.3} \]

The solution to this minimization problem is

\[ \mathbf{w}_{LS} = (\mathbf{X}^H\mathbf{X})^{-1}\mathbf{X}^H\mathbf{y} = \mathbf{X}^\dagger\mathbf{y}. \tag{2.3.4} \]

The quantity $\mathbf{X}^\dagger = (\mathbf{X}^H\mathbf{X})^{-1}\mathbf{X}^H$ is the Moore-Penrose pseudo-inverse of the matrix
$\mathbf{X}$. If the random variables $\mathbf{x}$ and $y$ are Gaussian distributed, the least squares solution
is the maximum likelihood solution.
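
As a small numerical check of eq. (2.3.4), the sketch below solves a tall system $\mathbf{X}\mathbf{w} = \mathbf{y}$ three ways: through the normal equations, through the pseudo-inverse, and with NumPy's least squares solver. The synthetic data, the dimensions, and the noise level are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
N_obs, N_par = 200, 4                       # tall system: more rows than columns

# Hypothetical "true" parameters and complex observations.
w_true = rng.standard_normal(N_par) + 1j * rng.standard_normal(N_par)
X = rng.standard_normal((N_obs, N_par)) + 1j * rng.standard_normal((N_obs, N_par))
noise = 0.01 * (rng.standard_normal(N_obs) + 1j * rng.standard_normal(N_obs))
y = X @ w_true + noise

# Normal equations, eq. (2.3.4): w_LS = (X^H X)^{-1} X^H y.
w_normal = np.linalg.solve(X.conj().T @ X, X.conj().T @ y)

# Same estimate via the Moore-Penrose pseudo-inverse and via lstsq.
w_pinv = np.linalg.pinv(X) @ y
w_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]

print(np.linalg.norm(w_normal - w_true))
```

All three routes give the same estimate up to round-off; in practice a solver such as `lstsq` is usually preferred to forming $\mathbf{X}^H\mathbf{X}$ explicitly because it is better conditioned numerically.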


The MMSE framework can also be used to estimate $\mathbf{w}$ for a random vector $\mathbf{y}$ and
random matrix $\mathbf{X}$ which are related by

\[ \mathbf{y} = \mathbf{w}^H\mathbf{X} + \boldsymbol{\nu}, \]

where $\boldsymbol{\nu}$ is a vector of random noise. Using the mean squared error cost function, the
minimization problem in this case is

\[ \mathbf{w}_{mmse} = \arg\min_{\mathbf{w}_{est}} \mathrm{E}\{\|\mathbf{y} - \mathbf{w}_{est}^H\mathbf{X}\|^2\}, \tag{2.3.5} \]

and the solution is

\[ \mathbf{w}_{mmse} = \mathrm{E}\{\mathbf{X}\mathbf{X}^H\}^{-1}\mathrm{E}\{\mathbf{X}\mathbf{y}^*\}. \tag{2.3.6} \]

Notice that eq. (2.3.6) has the same form as eq. (2.3.2) with the matrix $\mathbf{X}$ and vector
$\mathbf{y}$ from eq. (2.3.6) replacing the vector $\mathbf{x}$ and scalar $y$ in eq. (2.3.2).

In many situations, one will have a least-squares solution and new data will arrive
(e.g. $\mathbf{x}_{N+1}$). Given a new observation, the new least squares solution which incorporates
the new data can be computed efficiently without recomputing the whole solution.
One particularly effective method for computing a data-recursive least-squares
solution is the recursive-least-squares (RLS) algorithm.

2.3.2  Recursive  least  squares  (RLS)  filtering  algorithm 

The statistics of the underwater channel are often not available and there is not yet
an agreed upon model of the time variation of the underwater acoustic communication
channel. The underwater channel is often assumed to be varying "reasonably"
slowly so that the time-varying channel impulse response coefficients can be tracked.
This assumption enables the use of the exponentially weighted recursive least-squares
(EW-RLS) algorithm for estimating the channel or equalizer coefficients. This algorithm
is a balance between computational complexity and effectiveness since it can
track a time-varying channel effectively with reasonable complexity, $O(N^2)$, where
the quantity of interest has $N$ parameters (either the channel impulse response or the
equalizer coefficients). The notation $O(\cdot)$ refers to the highest order of the computation
complexity.

The EW-RLS algorithm provides an effective way to estimate the ensemble expectations
in the solution to the LSE equalizer equations. This section briefly covers
the algorithm with a quick derivation and a focus on the practical details for the
underwater channel. For more of the algorithmic details, Haykin [36] and Sayed [86]
are excellent resources.

The EW-RLS algorithm approximates the expectations in the MMSE solution
with time-averaged functions of the data,

\begin{align}
\mathbf{R}_x[n] = \mathbf{P}^{-1}[n] &= \sum_{i=0}^{n} \lambda^{n-i}\,\mathbf{x}[i]\mathbf{x}^H[i] + \delta\lambda^{n+1}\mathbf{I} \tag{2.3.7} \\
\mathbf{r}_{xy}[n] &= \sum_{i=0}^{n} \lambda^{n-i}\,\mathbf{x}[i]\,y^*[i]. \tag{2.3.8}
\end{align}

In these equations, $\lambda$ is the exponential weighting factor, $0 < \lambda < 1$. The term $\delta\lambda^{n+1}\mathbf{I}$
is added to $\mathbf{R}_x[n]$ (the term that is inverted) for regularization so the algorithm is initially well
behaved ($\delta$ is a system design parameter). This algorithm provides a computationally
efficient data-recursive method for updating the parameter estimates [36]. When
new values $\mathbf{x}[n]$ and $y[n]$ are received, the RLS algorithm for updating the estimates
$\mathbf{w}[n-1]$ and $\mathbf{P}[n-1]$ is


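\begin{align}
\mathbf{w}[0] &= \mathbf{0} \tag{2.3.9} \\
\mathbf{P}[0] &= \delta^{-1}\mathbf{I} \tag{2.3.10} \\
\boldsymbol{\pi}[n] &= \mathbf{P}[n-1]\,\mathbf{x}[n] \tag{2.3.11} \\
\mathbf{k}[n] &= \frac{\boldsymbol{\pi}[n]}{\lambda + \mathbf{x}^H[n]\,\boldsymbol{\pi}[n]} \tag{2.3.12} \\
\zeta[n] &= y[n] - \mathbf{w}^H[n-1]\,\mathbf{x}[n] \tag{2.3.13} \\
\mathbf{w}[n] &= \mathbf{w}[n-1] + \mathbf{k}[n]\,\zeta^*[n] \tag{2.3.14} \\
\mathbf{P}[n] &= (\mathbf{I} - \mathbf{k}[n]\,\mathbf{x}^H[n])\,\lambda^{-1}\,\mathbf{P}[n-1]. \tag{2.3.15}
\end{align}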

The value of the exponential weighting factor determines the size of the data averaging
window or memory of the algorithm. The memory of the algorithm is approximately
$1/(1-\lambda)$ [29]. A common rule of thumb for the underwater channel is that the algorithm
memory should be approximately two to three times the number of coefficients being
estimated. For example, when estimating a channel that spans 100 symbols, $\lambda \approx 0.995$
for a window of twice the channel length.

The EW-RLS has been shown to converge to the Wiener solution asymptotically
when the coefficients, $\mathbf{w}[n]$, are time-invariant and $\lambda \to 1$ [21]. The cost function that
is actually being solved using the EW-RLS algorithm,

\[ \mathbf{w}[n] = \arg\min_{\mathbf{w}} \left\{ \lambda^{n+1}\delta\,\mathbf{w}^H\mathbf{w} + \sum_{i=0}^{n} \lambda^{n-i}\,\bigl|y[i] - \mathbf{w}^H\mathbf{x}[i]\bigr|^2 \right\}, \tag{2.3.16} \]

is slightly different from the cost function for the MMSE estimation problem [86].
The EW-RLS algorithm can also be written as a constrained form of the Kalman
filter [87, 88].
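
The recursion in eqs. (2.3.9)-(2.3.15) translates almost line for line into code. The following is a minimal sketch rather than a tuned implementation: the function name `ew_rls`, the default values of `lam` and `delta`, and the synthetic time-invariant test system are all assumptions introduced for illustration.

```python
import numpy as np


def ew_rls(x_seq, y_seq, lam=0.995, delta=0.01):
    """Exponentially weighted RLS following eqs. (2.3.9)-(2.3.15).

    x_seq: array of shape (num_samples, N), one regressor x[n] per row.
    y_seq: array of shape (num_samples,), desired samples y[n].
    """
    N = x_seq.shape[1]
    w = np.zeros(N, dtype=complex)                   # w[0] = 0
    P = np.eye(N, dtype=complex) / delta             # P[0] = delta^{-1} I
    for x, y in zip(x_seq, y_seq):
        pi = P @ x                                   # pi[n] = P[n-1] x[n]
        k = pi / (lam + np.vdot(x, pi))              # gain, x^H pi in the denominator
        zeta = y - np.vdot(w, x)                     # a priori error y - w^H x
        w = w + k * np.conj(zeta)                    # coefficient update
        P = (P - np.outer(k, np.conj(x)) @ P) / lam  # (I - k x^H) P / lambda
    return w, P


# Hypothetical time-invariant system y = w^H x + v used only to exercise the recursion.
rng = np.random.default_rng(2)
N, num_samples = 4, 2000
w_true = rng.standard_normal(N) + 1j * rng.standard_normal(N)
x_seq = rng.standard_normal((num_samples, N)) + 1j * rng.standard_normal((num_samples, N))
v = 0.01 * (rng.standard_normal(num_samples) + 1j * rng.standard_normal(num_samples))
y_seq = x_seq @ np.conj(w_true) + v                  # each entry equals w^H x + v

w_hat, _ = ew_rls(x_seq, y_seq, lam=0.995)
print(np.linalg.norm(w_hat - w_true))
print(1.0 / (1.0 - 0.995))                           # approximate memory in samples
```

Each iteration costs $O(N^2)$ operations because of the matrix-vector product and the rank-one update of $\mathbf{P}$, which is the complexity quoted above.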


2.4  Equalization 

An  equalizer  is  a  structure  used  to  mitigate  ISI  and  channel  distortions  in  the  received 
signal.  This  thesis  focuses  on  two  particular  types  of  equalizers:  the  linear  equalizer 
(LE)  and  the  decision  feedback  equalizer  (DFE).  The  coefficients  for  both  of  these 
equalizer  structures  can  be  found  using  a  least-squares  type  of  optimization  criterion. 
The output of the equalizer filter is an estimate of the transmitted symbol,

\[ \hat{d}[n] = \mathbf{h}^H[n]\,\mathbf{z}[n]. \tag{2.4.1} \]

The vector $\mathbf{z}[n]$ either contains only received signal samples (for the LE) or received
signal samples and estimates of past data symbols (for the DFE). Examples of both
the LE and the DFE are shown in Figure 2.4.1.

The  cost  function  which  is  minimized  to  find  the  equalizer  coefficients  is 

\[ J(\mathbf{h}) = \mathrm{E}\{|\mathbf{h}^H\mathbf{z} - d|^2\}, \tag{2.4.2} \]

and the optimization problem to find the MMSE equalizer coefficients, $\mathbf{h}_{opt}[n]$, is
represented as

\[ \mathbf{h}_{opt} = \arg\min_{\mathbf{h}} \mathrm{E}\{|\mathbf{h}^H\mathbf{z} - d|^2\}. \tag{2.4.3} \]

Figure 2.4.1: Illustration of the (a) linear equalizer and (b) decision feedback equalizer
with the quantities h and z labeled.

The solution to this minimization is

\[ \mathbf{h}_{opt}[n] = \mathbf{R}_z^{-1}[n]\,\mathbf{r}_{zd}[n], \tag{2.4.4} \]

where $\mathbf{R}_z[n] = \mathrm{E}\{\mathbf{z}\mathbf{z}^H\}$ and $\mathbf{r}_{zd} = \mathrm{E}\{\mathbf{z}d^*\}$. Solving for the equalizer by directly
estimating $\mathbf{R}_z$ and $\mathbf{r}_{zd}$ from the received signal and possibly past data estimates is
referred to as Direct Adaptation equalization (DA). Assuming the expectations are
conditioned on a known channel, eq. (2.2.5) can be substituted into eq. (2.4.4) to
reduce the solution to a function of the channel impulse response values and the noise
statistics. This method is referred to as the channel estimate based (CEB) method
of equalization. In the following subsections, the coefficients for the DA and CEB
methods of both the LE and DFE are derived.

2.4.1  Linear  equalizer  (LE) 

The linear equalizer uses a linear combination of the received signal samples to create
an estimate of the transmitted symbol. This structure tends to have low computational
complexity, so it is often used in computation limited environments such as
embedded systems. The LE algorithm is a natural place to start theoretical derivations
due to its simple form.

The performance of the LE algorithm suffers when there are nulls in the channel
frequency response [36]. The LE algorithm attempts to invert the nulls, and in doing
so greatly amplifies the noise power at the nulls, which degrades equalizer performance.
This behavior makes the LE algorithm less than ideal for frequency selective channels.


The coefficients of the LE can be found using eq. (2.4.1) by setting $\mathbf{z}[n]$ equal to
the vector of received signal samples, $\mathbf{u}[n]$,

\[ \mathbf{z}[n] = [\,u[n-L_c+1]\ \ldots\ u[n]\ \ldots\ u[n+L_a]\,]^T = \mathbf{u}[n]. \tag{2.4.5} \]

The number of equalizer coefficients is $(L_a + L_c)\,r_{fs}$, where $L_a$ and $L_c$ are the number
of acausal and causal equalizer taps respectively and $r_{fs}$ is the number of samples
per symbol (the fractional sampling rate). Substituting the expression for $\mathbf{z}[n]$ from
eq. (2.4.5) into eq. (2.4.4) produces the expression for the LE coefficients,

\[ \mathbf{h}_{opt} = \mathbf{h}_{lin} = \mathrm{E}\{\mathbf{u}[n]\mathbf{u}^H[n]\}^{-1}\,\mathrm{E}\{\mathbf{u}[n]d^*[n]\}. \tag{2.4.6} \]

In this equation, $\mathbf{R}_u = \mathrm{E}\{\mathbf{u}[n]\mathbf{u}^H[n]\}$ is the received signal autocorrelation matrix
and $\mathbf{r}_{ud} = \mathrm{E}\{\mathbf{u}[n]d^*[n]\}$ is the cross correlation vector between the received signal and
the transmitted symbol. When the statistics are not known, the expectations must
be estimated from the available data. Using an exponentially weighted window, the
estimates of these quantities are of the form

\begin{align}
\hat{\mathbf{R}}_u[n] &= \delta^n\mathbf{I} + \sum_{i=0}^{n} \lambda^{n-i}\,\mathbf{u}[i]\mathbf{u}^H[i] \tag{2.4.7} \\
\hat{\mathbf{r}}_{ud}[n] &= \sum_{i=0}^{n} \lambda^{n-i}\,\mathbf{u}[i]\,d^*[i], \tag{2.4.8}
\end{align}

where $\lambda$ is the exponential weighting factor. The regularization term, $\delta^n\mathbf{I}$, is included
to ensure the matrix is well conditioned. Using the estimated autocorrelation matrix
and cross-correlation vector is the DA method of linear equalization. The EW-RLS
algorithm provides a computationally efficient, data-recursive method for updating
the equalizer coefficients.

An alternative to using these estimated quantities is to use the channel model from
eq. (2.2.5). The expectation in eq. (2.4.6) can be evaluated conditioned on knowing
the true channel, i.e.

\[ \mathbf{h}_{opt} = \mathbf{h}_{lin} = \mathrm{E}\{\mathbf{u}[n]\mathbf{u}^H[n]\,|\,\mathbf{G}[n]\}^{-1}\,\mathrm{E}\{\mathbf{u}[n]d^*[n]\,|\,\mathbf{G}[n]\}. \]

Conditioning on the known channel coefficients will be implicit for the remainder of
the thesis and not explicitly included in the expectation terms for brevity. Using the
channel model from eq. (2.2.5) in eq. (2.4.6), the expression for the LE coefficients
becomes

\[ \mathbf{h}_{lin}[n] = [\mathbf{G}^H[n]\mathbf{G}[n] + \mathbf{R}_\nu]^{-1}\,\mathbf{G}^H[n]\,\mathbf{s}. \tag{2.4.9} \]

In the above relation, $\mathbf{s}$ is a selection vector, the same length as $\mathbf{d}[n]$, that selects the
row of the channel convolution matrix, $\mathbf{G}[n]$, corresponding to the data symbol being
estimated. This row is referenced using the symbol $\mathbf{g}_0[n]$. The selection vector is
$\mathbf{s} = [0\ 0\ \ldots\ 1\ \ldots\ 0\ 0]^T$, where the 1 is located at the position of the symbol being
estimated, $d[n]$, in the transmitted symbol vector $\mathbf{d}[n]$.


One rarely has access to the true channel impulse response coefficients, and so the
channel is estimated from the received data. When a channel estimate, $\hat{\mathbf{G}}[n]$, is used
in place of the true channel to compute the equalizer coefficients, this is the CEB
method of linear equalization.


The error in estimating the transmitted data after equalization, $e_{LE}[n]$, is referred
to as the soft decision error (SDE),

\[ e_{LE}[n] = \mathbf{h}_{lin}^H[n]\,\mathbf{u}[n] - d[n]. \tag{2.4.10} \]

The mean squared error (MSE) is the expected squared magnitude of the
SDE,

\[ \mathrm{MSE} = \mathrm{E}\{|e_{LE}[n]|^2\}. \tag{2.4.11} \]

The term MSE will also be used to refer to the time averaged observed squared error
where the expectation is replaced with an empirical average,

\[ \mathrm{MSE} = \frac{1}{M}\sum_{m=1}^{M} |e_{LE}[m]|^2. \]

Figure 2.4.2: Schematic representation of LE methods: (a) Direct Adaptation
(b) Channel Estimate Based

The minimum achievable error (MAE) is found by substituting eq. (2.2.5) and
eq. (2.4.9) into eq. (2.4.11). The expression for the MAE is

\[ \mathrm{MAE} = \sigma_{LE}^2[n] = 1 - \mathbf{g}_0^H[n]\,[\mathbf{G}^H[n]\mathbf{G}[n] + \mathbf{R}_\nu]^{-1}\,\mathbf{g}_0[n]. \tag{2.4.12} \]
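
The channel-based LE expressions can be evaluated directly once a channel is fixed. The sketch below assumes a known, time-invariant channel, symbol rate sampling, unit-energy white symbols, and white noise ($\mathbf{R}_\nu = \sigma_\nu^2\mathbf{I}$); the tap values, the noise variance, and the helper `conv_matrix` are hypothetical. It computes the coefficients of eq. (2.4.9) and confirms that the MAE of eq. (2.4.12) equals the mean squared error obtained by expanding eq. (2.4.11) with those coefficients.

```python
import numpy as np


def conv_matrix(g, L_c, L_a):
    """Time-invariant channel convolution matrix: each column is a shifted copy of g,
    with g ordered as in eq. (2.2.3), i.e. [g[N_c-1], ..., g[0], ..., g[-N_a]]."""
    n_cols = L_c + L_a
    G = np.zeros((n_cols + len(g) - 1, n_cols), dtype=complex)
    for c in range(n_cols):
        G[c:c + len(g), c] = g
    return G


N_c, N_a, L_c, L_a = 3, 1, 4, 2
g = np.array([0.2, -0.4j, 1.0, 0.3])           # hypothetical channel taps
noise_var = 0.1                                 # assumed white-noise variance
G = conv_matrix(g, L_c, L_a)
R_nu = noise_var * np.eye(L_c + L_a)

s = np.zeros(G.shape[0])
s[L_c + N_c - 2] = 1.0                          # position of d[n] inside the vector d[n]

g_0 = G.conj().T @ s                            # G^H s, the cross-correlation vector
R_u = G.conj().T @ G + R_nu                     # received-signal autocorrelation
h_lin = np.linalg.solve(R_u, g_0)               # eq. (2.4.9)
mae_le = 1.0 - np.real(np.vdot(g_0, h_lin))     # eq. (2.4.12)

# E{|h^H u - d|^2} expanded for unit-energy white symbols.
mse_direct = np.real(np.vdot(h_lin, R_u @ h_lin) - 2.0 * np.real(np.vdot(h_lin, g_0)) + 1.0)
print(mae_le, mse_direct)
```

Replacing `G` with a matrix built from a channel estimate gives the CEB coefficients, in which case the achieved MSE is generally larger than the MAE computed here.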


2.4.2  Decision  feedback  equalizer  (DFE) 

A DFE consists of two linear filters working in concert: a feedforward section that
filters the received signal and a feedback section that filters past data symbol estimates.
The purpose of the feedforward filter is to collect energy and shape the response of
the received signal. The feedback filter is used to cancel causal ISI by removing
interference from past received data symbols from the received signal. By removing the
causal ISI, the feedback filter increases the effective signal-to-interference-plus-noise
ratio (SINR). This is one reason a DFE outperforms an LE. The sum of the outputs of the
feedforward and feedback filters gives an estimate of the transmitted symbol. Proakis
[79] and Qureshi [81] are good references for an overview of the DFE algorithm.

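The feedforward and feedback equalizer coefficients are labelled respectively as $\mathbf{h}_{ff}$
and $\mathbf{h}_{fb}$. For the DFE, $\mathbf{z}[n]$ and $\mathbf{h}[n]$ from eq. (2.4.4) are defined as

\begin{align}
\mathbf{z}[n] &= [\,u[n-L_c+1]\ \ldots\ u[n+L_a],\ \hat{d}[n-1]\ \ldots\ \hat{d}[n-L_{fb}]\,]^T \tag{2.4.13} \\
\mathbf{h}[n] &= [\,\mathbf{h}_{ff}^T[n]\ \ \mathbf{h}_{fb}^T[n]\,]^T, \tag{2.4.14}
\end{align}

where $\hat{d}[m]$ is the estimated transmitted data at time $m$ and $L_{fb}$ is the number of
feedback equalizer coefficients. Using the above definitions for $\mathbf{z}$ and $\mathbf{h}$ in eq. (2.4.4),
the optimal DFE coefficients are

\[ \mathbf{h}_{DFE}[n] = \mathrm{E}\{\mathbf{z}\mathbf{z}^H\}^{-1}\,\mathrm{E}\{\mathbf{z}d^*\}, \tag{2.4.15} \]

where $\mathbf{R}_z = \mathrm{E}\{\mathbf{z}[n]\mathbf{z}^H[n]\}$ is the autocorrelation matrix of $\mathbf{z}[n]$ and $\mathbf{r}_{zd} = \mathrm{E}\{\mathbf{z}[n]d^*[n]\}$
is the cross-correlation vector between elements of $\mathbf{z}[n]$ and the transmitted data
symbol. Using an exponentially weighted window, the estimates of these quantities
are of the form

\begin{align}
\hat{\mathbf{R}}_z[n] &= \delta^n\mathbf{I} + \sum_{i=0}^{n} \lambda^{n-i}\,\mathbf{z}[i]\mathbf{z}^H[i] \tag{2.4.16} \\
\hat{\mathbf{r}}_{zd}[n] &= \sum_{i=0}^{n} \lambda^{n-i}\,\mathbf{z}[i]\,d^*[i], \tag{2.4.17}
\end{align}

where $\lambda$ is again the exponential weighting factor. When the estimated auto-correlation
matrix and cross-correlation vector are used, this is known as the DA-DFE algorithm.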

In a DFE, previously estimated transmitted data symbols are used to estimate
the current symbol. The rows of the channel convolution matrix, $\mathbf{G}[n]$, corresponding
to the symbols that are used in the feedback stage of the DFE are assembled into a
new matrix, labeled $\mathbf{G}_{fb}[n]$. Rows corresponding to symbols not used in the feedback
portion are placed in a matrix $\mathbf{G}_0[n]$, referred to as the reduced channel convolution
matrix. If all of the previously estimated symbols are used in the feedback stage, the
channel convolution matrix can be separated as

\[ \mathbf{G}[n] = \begin{bmatrix} \mathbf{G}_{fb}[n] \\ \mathbf{G}_0[n] \end{bmatrix}. \tag{2.4.18} \]

Other matrices are introduced which have the same dimensions as the channel
convolution matrix. These matrices will typically be added to the channel convolution
matrix during different derivations. To simplify notation, the subscript 'fb' will refer
to rows of these matrices in the same positions as the rows of the channel
convolution matrix which correspond to symbols which are used for feedback. The
subscript '0' will refer to the reduced matrix comprised of the remaining rows.

The channel model from eq. (2.2.5) can be used to create an alternative to the
estimate from eq. (2.4.15). Using the separation of the channel convolution matrix in
eq. (2.4.18), the channel model becomes

\[ \mathbf{u}[n] = \mathbf{G}^H[n]\mathbf{d}[n] + \boldsymbol{\nu}[n] = \mathbf{G}_0^H[n]\mathbf{d}_0[n] + \mathbf{G}_{fb}^H[n]\mathbf{d}_{fb}[n] + \boldsymbol{\nu}[n], \tag{2.4.19} \]

where $\mathbf{d}_{fb}[n]$ corresponds to the transmitted symbol positions used in the feedback
section. The remainder of the symbols from $\mathbf{d}[n]$ are assembled in $\mathbf{d}_0[n]$. The DFE
coefficients can be expressed as a function of the channel coefficients by substituting
eq. (2.4.19) into eq. (2.4.15). The expressions for the feedforward and feedback
coefficients are [75]

\begin{align}
\mathbf{h}_{ff}[n] &= [\mathbf{G}_0^H[n]\mathbf{G}_0[n] + \mathbf{R}_\nu]^{-1}\,\mathbf{G}_0^H[n]\,\mathbf{s} \tag{2.4.20} \\
\mathbf{h}_{fb}[n] &= -\mathbf{G}_{fb}[n]\,\mathbf{h}_{ff}[n]. \tag{2.4.21}
\end{align}

Figure 2.4.3: Schematic representation of DFE methods: (a) Direct Adaptation
(b) Channel Estimate Based


The  CEB  method  for  estimating  the  DFE  coefficients  involves  using  channel  es¬ 
timates  to  build  the  channel  convolution  matrices  in  eq.  (2.4.20).  Figure  2.4.3  shows 
both  the  DA  and  the  CEB  forms  of  the  DFE.  For  the  DFE  the  soft  decision  error  is 

eDFE [n]  =  hEFE[n}z[n)  -  d[n\  =  hffH[n]u[n]  +  hfb^njdfbfn]  -  d[n\.  (2.4.22) 

Given  g0[n]  =  G^[n]s,  the  MAE  is 

^IdfeW  =  E{|eDFE N|2}  =  1  -  go  [n][G^[n]G0[n]  +  R^gJM  (2.4.23) 

The MAE is usually not achieved when the channel is unknown and must be
estimated. In this case, the MSE is

$$\mathrm{MSE} = E\{|e_{DFE}[n]|^2\}.$$

Since the statistics of the error are generally not known, they must be estimated from
the data. The averaged observed error,

$$\widehat{\mathrm{MSE}} = \frac{1}{M}\sum_{m=1}^{M}|e_{DFE}[m]|^2,$$

is also referred to as the MSE.

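The MAE of the LE can be compared with the MAE of the DFE by substituting
the partitioned channel convolution matrix from eq. (2.4.19) into eq. (2.4.12), which gives the
expression (time indexes dropped for clarity)

$$\sigma^2_{LE} = 1 - \mathbf{s}^H\mathbf{G}\left[\mathbf{G}_o^H\mathbf{G}_o + \mathbf{G}_{fb}^H\mathbf{G}_{fb} + \mathbf{R}_\nu\right]^{-1}\mathbf{G}^H\mathbf{s} \tag{2.4.24}$$

$$\phantom{\sigma^2_{LE}} = \sigma^2_{DFE} + \mathbf{h}_{ff}^H\left[\mathbf{I} + \mathbf{W}\mathbf{Q}^{-1}\right]^{-1}\mathbf{W}\mathbf{h}_{ff}. \tag{2.4.25}$$

In this relation, $\mathbf{W} = \mathbf{G}_{fb}^H\mathbf{G}_{fb}$, $\mathbf{Q} = \mathbf{G}_o^H\mathbf{G}_o + \mathbf{R}_\nu$, and $\mathbf{h}_{ff}$ is given by eq. (2.4.20). $\mathbf{R}_\nu$
is Hermitian and positive definite and $\mathbf{G}_o^H\mathbf{G}_o$ is Hermitian and positive semi-definite, so $\mathbf{Q}$ is positive
definite. $\mathbf{W}$ is a positive semi-definite matrix equal to zero when $\mathbf{G}_{fb} = \mathbf{0}$. Consequently,

$$\mathbf{h}_{ff}^H\left[\mathbf{I} + \mathbf{W}\mathbf{Q}^{-1}\right]^{-1}\mathbf{W}\mathbf{h}_{ff} \geq 0.$$

Thus, the MAE of the DFE is never greater than the MAE of the LE, and it is strictly smaller except when
either $\mathbf{G}_{fb} = \mathbf{0}$ or the feedforward equalizer coefficients are in the null space of $\mathbf{G}_{fb}$,
which is not common.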
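The relation between the two MAE expressions can be checked numerically. The sketch below is only an illustration: the matrix sizes, the random channel matrices, the noise level, and the helper `crandn` are assumptions made here, and unit-variance symbols are assumed as in eq. (2.4.23).

```python
import numpy as np

rng = np.random.default_rng(0)

def crandn(*shape):
    """Complex Gaussian samples with unit variance (hypothetical helper)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

L = 6                # received samples spanned by the feedforward filter
n_o, n_fb = 4, 3     # symbols not fed back / symbols fed back
G_o = crandn(n_o, L)                 # rows index symbols, columns index received samples
G_fb = crandn(n_fb, L)
R_nu = 0.1 * np.eye(L)               # observation noise correlation (assumed white)
s = np.zeros(n_o)
s[-1] = 1.0                          # selects the desired symbol within d_o

g_o = G_o.conj().T @ s
Q = G_o.conj().T @ G_o + R_nu
W = G_fb.conj().T @ G_fb

h_ff = np.linalg.solve(Q, g_o)                                       # form of eq. (2.4.20)
mae_dfe = 1.0 - np.real(g_o.conj() @ h_ff)                           # form of eq. (2.4.23)
mae_le = 1.0 - np.real(g_o.conj() @ np.linalg.solve(Q + W, g_o))     # form of eq. (2.4.24)

# Gap term h_ff^H [I + W Q^{-1}]^{-1} W h_ff from eq. (2.4.25)
gap = np.real(h_ff.conj() @ np.linalg.solve(np.eye(L) + W @ np.linalg.inv(Q), W @ h_ff))

print(mae_le, mae_dfe + gap)
```

The two printed values should agree to numerical precision, and `gap` should be non-negative for any choice of the random matrices.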
2.5  Channel  estimation


The  underwater  channel  is  well  modeled  by  a  finite  number  of  linear  coefficients. 
When  all  of  the  statistics  of  the  transmitted  and  received  data  are  known,  these 
coefficients  can  be  estimated  using  (linear)  MMSE  methods.  Assuming  the  channel 
coefficients  are  time  invariant,  the  MMSE  channel  coefficients  are  a  solution  to  the 
optimization  problem 


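$$\mathbf{g}_{opt} = \arg\min_{\mathbf{g}} E\{|\mathbf{g}^H\mathbf{d}[n] - u[n]|^2\}, \tag{2.5.1}$$

where $\mathbf{d}[n]$ is a vector of transmitted data, $u[n]$ is the received data, and $\mathbf{g}$ are the
channel coefficients. The solution to this optimization has the form

$$\mathbf{g}_{MMSE} = \mathbf{R}_d^{-1}\mathbf{r}_{du}, \tag{2.5.2}$$

where

$$\mathbf{R}_d = E\{\mathbf{d}[n]\mathbf{d}^H[n]\} \tag{2.5.3}$$

$$\mathbf{r}_{du} = E\{\mathbf{d}[n]u^*[n]\}. \tag{2.5.4}$$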

In  practice,  the  statistics  of  the  transmitted  and  received  data  are  not  known 
fully  and  must  be  estimated.  A  method  known  as  least  squared  error  (LSE)  channel 
estimation  is  often  used.  In  this  method  the  true  expectations  are  replaced  with 
the  observed  time-averages.  As  the  number  of  samples  increases,  the  time-averages 
will  converge  to  the  true  solution.  Therefore,  when  the  channel  is  time-invariant  and 
a  sufficient  number  of  channel  observations  are  available,  MMSE  and  LSE  channel 
estimation  are  practically  equivalent. 

After  N  symbols  have  been  received,  the  LSE  channel  estimate  has  the  form 

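$$\hat{\mathbf{g}}_{LSE} = \hat{\mathbf{R}}_d^{-1}\hat{\mathbf{r}}_{du}, \tag{2.5.5}$$

where

$$\hat{\mathbf{R}}_d = \frac{1}{N}\sum_{i=1}^{N}\mathbf{d}[i]\mathbf{d}^H[i] \tag{2.5.6}$$

$$\hat{\mathbf{r}}_{du} = \frac{1}{N}\sum_{i=1}^{N}\mathbf{d}[i]u^*[i]. \tag{2.5.7}$$

Using an EW-RLS algorithm, the estimates are

$$\hat{\mathbf{R}}_d = \delta\lambda^{N}\mathbf{I} + \sum_{i=1}^{N}\lambda^{N-i}\mathbf{d}[i]\mathbf{d}^H[i] \tag{2.5.8}$$

$$\hat{\mathbf{r}}_{du} = \sum_{i=1}^{N}\lambda^{N-i}\mathbf{d}[i]u^*[i]. \tag{2.5.9}$$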

Scaling factors common to both estimates are not included here since they will
eventually cancel out.
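The exponentially weighted estimates above can be sketched in a few lines. This is a minimal illustration rather than the implementation used in the thesis: the function name `ls_channel_estimate`, the tap values, the noise level, and the forgetting factor are assumptions, the regularization term is omitted, and the data vector is ordered as `[d[n], d[n-1], ...]` purely for convenience.

```python
import numpy as np

def ls_channel_estimate(d, u, n_taps, lam=1.0):
    """Exponentially weighted LSE estimate of g in u[n] = g^H d[n] + noise."""
    R = np.zeros((n_taps, n_taps), dtype=complex)
    r = np.zeros(n_taps, dtype=complex)
    n_last = len(d) - 1
    for n in range(n_taps - 1, len(d)):
        d_vec = d[n - n_taps + 1:n + 1][::-1]      # [d[n], d[n-1], ..., d[n-n_taps+1]]
        w = lam ** (n_last - n)                    # exponential window weight
        R += w * np.outer(d_vec, d_vec.conj())     # weighted sum of d d^H
        r += w * d_vec * np.conj(u[n])             # weighted sum of d u^*
    return np.linalg.solve(R, r)

rng = np.random.default_rng(1)
g_true = np.array([1.0, 0.5j, -0.25])              # illustrative channel taps
d = rng.choice([-1.0, 1.0], size=2000).astype(complex)   # BPSK symbols
u = np.zeros(len(d), dtype=complex)
for n in range(len(g_true) - 1, len(d)):
    d_vec = d[n - len(g_true) + 1:n + 1][::-1]
    u[n] = np.vdot(g_true, d_vec)                  # g^H d[n]
u += 0.01 * (rng.standard_normal(len(d)) + 1j * rng.standard_normal(len(d)))

g_hat = ls_channel_estimate(d, u, n_taps=3, lam=0.999)
print(np.round(g_hat, 3))
```

Setting `lam=1.0` gives the unweighted sums of eqs. (2.5.6) and (2.5.7) up to the common scale factor, which cancels in the solve.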

Both  the  MMSE  and  LSE  channel  estimators  are  unbiased.  Under  the  assumption 
that  the  noise  is  zero-mean,  the  estimate  of  the  channel  is  also  unbiased,  even  if  the 
number  of  coefficients  in  the  model  differs  from  the  number  of  channel  coefficients  in 
the  true  channel. 

For example, consider the following scenario: the true channel has $N$ coefficients
and the channel model used by the estimator contains only $N - 1$ coefficients. The
true channel is unknown and time-invariant. If the true channel is written as

$$\mathbf{g}[n] = [g[n,0] \;\cdots\; g[n,N-1]]^T,$$

the truncated channel is

$$\mathbf{g}'[n] = [g[n,0] \;\cdots\; g[n,N-2]]^T,$$

and the truncated transmitted data vector is

$$\mathbf{d}'[n] = [d[n-N+2] \;\cdots\; d[n]]^T.$$

Using the linear model from eq. (2.2.2), and taking the transmitted symbols to be white with unit variance, the channel estimate is

$$\begin{aligned}
\mathbf{g}_{reduced}[n] &= \mathbf{R}_{d'}^{-1}\mathbf{r}_{d'u} \\
&= E\{\mathbf{d}'[n]\mathbf{d}'^H[n]\}^{-1}E\{\mathbf{d}'[n]u^*[n]\} \\
&= \mathbf{I}_{N-1}\,E\{\mathbf{d}'[n](\mathbf{g}^H[n]\mathbf{d}[n] + \nu[n])^*\} \\
&= E\{\mathbf{d}'[n]\mathbf{d}^H[n]\mathbf{g}[n]\} + E\{\mathbf{d}'[n]\nu^*[n]\} \\
&= \begin{bmatrix}\mathbf{I}_{N-1} & \mathbf{0}_{(N-1)\times 1}\end{bmatrix}\mathbf{g}[n] + \mathbf{0} \\
&= \mathbf{g}'[n].
\end{aligned} \tag{2.5.10}$$


Therefore,  the  channel  estimate  is  unbiased  (for  the  modeled  parameters)  even  when 
the  number  of  channel  coefficients  in  the  model  differs  from  the  number  of  channel 
coefficients  in  the  true  channel. 
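The unbiasedness of the truncated estimator can be illustrated with a short simulation. The sketch assumes a real-valued channel, white BPSK symbols, and illustrative tap values and noise level; none of these come from the thesis.

```python
import numpy as np

rng = np.random.default_rng(2)
g_true = np.array([0.9, -0.4, 0.3, 0.2])      # N = 4 true taps (illustrative values)
N = len(g_true)
n_sym = 50_000
d = rng.choice([-1.0, 1.0], size=n_sym)       # white, unit-variance BPSK symbols

# Each row of D is a data vector [d[n], d[n-1], ..., d[n-N+1]]
D = np.stack([d[N - 1 - k:n_sym - k] for k in range(N)], axis=1)
u = D @ g_true + 0.1 * rng.standard_normal(D.shape[0])   # real channel, so g^H d = g^T d

# Estimator that models only the first N-1 taps
D_short = D[:, :N - 1]
g_short, *_ = np.linalg.lstsq(D_short, u, rcond=None)

print(np.round(g_short, 3), g_true[:N - 1])
```

With white data the unmodeled tap behaves like additional noise: it raises the variance of the estimate but does not shift its mean, which is what the derivation above predicts.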


2.6  Summary 

This chapter covered the difficulties of the underwater channel, linear estimation,
equalization, and channel modeling. The remainder of this thesis focuses on the
DFE where the coefficients are calculated from the data using an EW-RLS algorithm.
Effects that are specific to the underwater channel are used to examine and improve
the performance of the DFE for underwater channels.

Chapter 3


Effective  noise  correlation  matrix: 
Equalizer  improvements  through  a 
structured  matrix 

3.1  Introduction 

Often overlooked in the CEB-DFE formulation is that two quantities are needed to
calculate the equalizer coefficients: an estimate of the channel impulse response and an
estimate of the effective noise correlation matrix [81]. The effective noise includes both
the observation noise and terms due to channel modeling errors. This effective noise
correlation matrix is usually approximated as a scaled identity matrix with a scaling
equal to the inverse signal to noise ratio (SNR) [79, 91]. For the underwater channel,
this turns out to be a poor estimate. Preisig [75] demonstrated theoretically and
experimentally that using a full estimate of the effective noise correlation matrix for
computing the equalizer coefficients reduces the mean squared error after equalization.

In a shallow water communication channel, neighboring channel impulse response
coefficients often exhibit correlated fluctuation [103]. Figure 3.1.1 shows an example
of a measured time-varying impulse response from the surface processes and acoustic
communication experiment (SPACE08) in 2008. Notice that the amplitudes of
neighboring channel coefficients rise and fall together in time.

Figure 3.1.1: The magnitude of an estimated channel impulse response at 1 km from
the transmitter (SPACE08 experiment).

The work presented in this chapter shows that correlated fluctuations are responsible
for the effective noise correlation matrix having a non-diagonal structure. The
correlation matrix is shown to be well approximated by a Toeplitz matrix, which leads
to a computationally efficient algorithm for computing the DFE filter coefficients.

3.2  Channel  estimate  based  DFE 

The DFE is widely used in the underwater environment because it is a computationally
tractable way to mitigate channel effects [110]. Recall from Section 2.4 that the
coefficients of the decision feedback equalizer have the form

$$\mathbf{h}_{ff}[n] = \left(\mathbf{G}_o^H[n]\mathbf{G}_o[n] + \sigma_d^{-2}\mathbf{R}_\nu\right)^{-1}\mathbf{g}_o[n] \tag{3.2.1}$$

$$\mathbf{h}_{fb}[n] = -\mathbf{G}_{fb}[n]\mathbf{h}_{ff}[n]. \tag{3.2.2}$$

Figure  3.2.1  shows  a  block  diagram  of  the  structure  of  the  CEB-DFE,  where  estimates 
of  the  channel  are  used  in  place  of  the  true  (unknown)  channel. 

For terrestrial RF communication systems, the observation noise, $\nu[n]$, is assumed
to be a stationary, zero-mean, white noise process with variance $\sigma_\nu^2$ [79]. This implies
that the observation noise correlation matrix is a scaled identity matrix [91],
so that the noise term in eq. (3.2.1) becomes $\sigma_d^{-2}\mathbf{R}_\nu = \rho\mathbf{I}$, where $\rho$ is defined as the inverse SNR,

$$\rho = \frac{\sigma_\nu^2}{\sigma_d^2}. \tag{3.2.3}$$

Figure 3.2.1: Illustration of the structure of a CEB-DFE.


3.3  Structure of the effective noise correlation matrix


In underwater communication systems the channel coefficients are rarely known a
priori and must be estimated from the received data. Due to observation noise and
the time-variability of the channel, the estimate of the channel usually contains some
error. This estimation error can be represented as

$$\mathbf{G}[n] = \hat{\mathbf{G}}[n-1] + \boldsymbol{\Gamma}[n], \tag{3.3.1}$$

where $\hat{\mathbf{G}}[n-1]$ is the estimate of the channel convolution matrix using data up until
time $n - 1$ and $\boldsymbol{\Gamma}[n]$ is the error in the estimate. Using this model, the received data
vector can be rewritten as

$$\begin{aligned}
\mathbf{u}[n] &= \mathbf{G}^H[n]\mathbf{d}[n] + \boldsymbol{\nu}[n] \\
&= \hat{\mathbf{G}}^H[n-1]\mathbf{d}[n] + \boldsymbol{\Gamma}^H[n]\mathbf{d}[n] + \boldsymbol{\nu}[n] \\
&= \hat{\mathbf{G}}^H[n-1]\mathbf{d}[n] + \boldsymbol{\mu}[n],
\end{aligned} \tag{3.3.2}$$

where $\boldsymbol{\mu}[n]$ is the effective noise,

$$\boldsymbol{\mu}[n] = \boldsymbol{\Gamma}^H[n]\mathbf{d}[n] + \boldsymbol{\nu}[n]. \tag{3.3.3}$$

The first term in the effective noise is the noise due to channel estimation errors
and the second term is the observation noise.

In much of the literature on equalization the effective noise correlation matrix is modeled as a scaled
identity matrix. Preisig [75] demonstrated that the performance of a DFE can be
greatly improved by calculating the equalizer coefficients using an estimate of the
correlation matrix of the effective noise computed from the signal estimation residual
error. Assuming that the transmitted data symbols are IID with variance $\sigma_d^2$, the
effective noise correlation matrix, $\mathbf{R}_\mu$, is

$$\begin{aligned}
\mathbf{R}_\mu[n] &= E\{\boldsymbol{\mu}[n]\boldsymbol{\mu}^H[n]\} \\
&= E\{(\boldsymbol{\Gamma}^H[n]\mathbf{d}[n] + \boldsymbol{\nu}[n])(\boldsymbol{\Gamma}^H[n]\mathbf{d}[n] + \boldsymbol{\nu}[n])^H\} \\
&= \sigma_d^2\mathbf{R}_\Gamma[n] + \mathbf{R}_\nu[n],
\end{aligned} \tag{3.3.4}$$

where

$$\mathbf{R}_\Gamma[n] = E\{\boldsymbol{\Gamma}^H[n]\boldsymbol{\Gamma}[n]\} \tag{3.3.5}$$

is the channel estimation error correlation matrix.
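The step from the second to the third line of eq. (3.3.4) can be spelled out as follows. This is a sketch under assumptions that go slightly beyond what is stated above: the data vector is taken to be zero-mean and independent of both the estimation error matrix and the observation noise, and the observation noise is taken to be zero-mean (time indexes dropped for clarity).

```latex
\begin{aligned}
\mathbf{R}_\mu
 &= E\{\boldsymbol{\Gamma}^H \mathbf{d}\mathbf{d}^H \boldsymbol{\Gamma}\}
  + E\{\boldsymbol{\Gamma}^H \mathbf{d}\,\boldsymbol{\nu}^H\}
  + E\{\boldsymbol{\nu}\,\mathbf{d}^H \boldsymbol{\Gamma}\}
  + E\{\boldsymbol{\nu}\boldsymbol{\nu}^H\} \\
 &= E\{\boldsymbol{\Gamma}^H\, E\{\mathbf{d}\mathbf{d}^H\}\, \boldsymbol{\Gamma}\}
  + \mathbf{0} + \mathbf{0} + \mathbf{R}_\nu
  && \text{(cross terms vanish since } E\{\mathbf{d}\} = \mathbf{0}\text{)} \\
 &= \sigma_d^2\, E\{\boldsymbol{\Gamma}^H \boldsymbol{\Gamma}\} + \mathbf{R}_\nu
  && \text{(IID symbols: } E\{\mathbf{d}\mathbf{d}^H\} = \sigma_d^2 \mathbf{I}\text{)} \\
 &= \sigma_d^2\, \mathbf{R}_\Gamma + \mathbf{R}_\nu .
\end{aligned}
```

The independence of the error matrix and the data vector is an idealization, since past symbols also enter the channel estimate; it is used here only to make the structure of the result visible.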

When the MMSE channel estimate is used, the error is zero-mean and uncorrelated
with the estimate. The feedforward and feedback DFE coefficients can be
written as

$$\begin{aligned}
\mathbf{h}_{ff}[n] &= \left(\hat{\mathbf{G}}_o^H[n-1]\hat{\mathbf{G}}_o[n-1] + \mathbf{R}_\Gamma[n] + \sigma_d^{-2}\mathbf{R}_\nu[n]\right)^{-1}\hat{\mathbf{g}}_o[n-1] \\
\mathbf{h}_{fb}[n] &= -\hat{\mathbf{G}}_{fb}[n-1]\mathbf{h}_{ff}[n].
\end{aligned} \tag{3.3.6}$$

The errors for the entire channel convolution matrix are contained in $\mathbf{R}_\Gamma$, so the effect
of channel estimation errors on the feedback equalizer coefficients is contained in the
term $\mathbf{h}_{ff}$.

The  feedback  equalizer  coefficients  have  the  same  form  as  before  with  the  estimate 
used  in  place  of  the  true  channel  coefficients.  An  additional  term  has  appeared  in  the 
feedforward  equalizer  coefficients  that  is  a  product  of  the  channel  convolution  matrix 
error  terms.  The  next  section  analyzes  the  structure  of  the  channel  convolution  error 
matrix  and  explains  why  there  are  off-diagonal  terms  in  the  underwater  channel. 
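The coefficient computation of eq. (3.3.6) can be sketched as a small function. The function name `ceb_dfe_coefficients`, the matrix sizes, the random channel estimates, and the two example noise matrices are illustrative assumptions, not values from the thesis.

```python
import numpy as np

def ceb_dfe_coefficients(G_o_hat, G_fb_hat, R_eff, s):
    """Feedforward and feedback taps in the form of eq. (3.3.6).

    G_o_hat, G_fb_hat : estimated channel convolution sub-matrices
                        (rows index symbols, columns index received samples)
    R_eff             : R_Gamma + R_nu / sigma_d^2, the effective noise term
    s                 : selection vector picking the desired symbol in d_o
    """
    g_o_hat = G_o_hat.conj().T @ s
    Q = G_o_hat.conj().T @ G_o_hat + R_eff
    h_ff = np.linalg.solve(Q, g_o_hat)
    h_fb = -G_fb_hat @ h_ff
    return h_ff, h_fb

rng = np.random.default_rng(3)
L, n_o, n_fb = 6, 4, 3
G_o_hat = rng.standard_normal((n_o, L)) + 1j * rng.standard_normal((n_o, L))
G_fb_hat = rng.standard_normal((n_fb, L)) + 1j * rng.standard_normal((n_fb, L))
s = np.zeros(n_o)
s[-1] = 1.0

# Two candidate effective-noise models: scaled identity and Toeplitz with one off-diagonal
lag = np.abs(np.subtract.outer(np.arange(L), np.arange(L)))
R_identity = 0.1 * np.eye(L)
R_toeplitz = 0.1 * np.eye(L) + 0.04 * (lag == 1)

h_ff_id, h_fb_id = ceb_dfe_coefficients(G_o_hat, G_fb_hat, R_identity, s)
h_ff_tp, h_fb_tp = ceb_dfe_coefficients(G_o_hat, G_fb_hat, R_toeplitz, s)
```

Only the feedforward solve depends on the effective noise matrix; the feedback taps change only through `h_ff`, which mirrors the observation in the text above.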

3.4  Why there are off-diagonal terms in the effective noise correlation matrix

In  much  of  the  equalization  literature  the  effective  noise  correlation  matrix  is  modeled 
as  a  scaled  identity  matrix,  where  the  scaling  is  (approximately)  the  inverse  SNR.  In 
underwater  communication  systems,  the  mean  squared  error  of  the  estimated  data 
symbols  after  DFE  equalization  is  increased  by  using  this  approximation  [75].  The 
physical  cause  of  the  off-diagonal  terms  has  not  previously  been  shown.  In  this  section, 
statistical  analysis  is  provided  which  indicates  that  the  off-diagonal  terms  are  caused 
by  correlated  fluctuations  of  neighboring  channel  impulse  response  coefficients. 

3.4.1  Using  the  Markov  channel  model  for  noise  analysis 

To obtain analytical results, the dynamics of the channel impulse response coefficients
are assumed to follow a first-order Markov model,

$$\mathbf{g}[n+1] = \alpha\mathbf{g}[n] + \mathbf{v}[n+1], \tag{3.4.1}$$

where $\alpha$ is a scalar model parameter with $|\alpha| < 1$ and all vectors are $N \times 1$, where
$N = N_a + N_c$ is the channel length (as defined in Section 2.2). The process noise
vector, $\mathbf{v}[n]$, has a correlation matrix defined as

$$\mathbf{R}_v = E\{\mathbf{v}[n]\mathbf{v}^H[n]\}.$$

When analyzing the structure of the effective noise correlation matrix, two types
of error can be defined: the effective noise, $\mu[n]$, and the channel estimation error, $\boldsymbol{\gamma}[n]$.

Recall that the effective noise, $\mu[n]$, is the difference between the actual received
signal and the estimate of the received signal using a channel estimate based upon data
up to and including time $n - 1$,

$$\begin{aligned}
\mu[n] &= u[n] - \hat{u}[n] \\
&= u[n] - \hat{\mathbf{g}}^H[n-1]\mathbf{d}'[n].
\end{aligned} \tag{3.4.2}$$

In the literature, the effective noise is also known as the received data prediction error
[66] since the channel model can be thought of as a prediction filter. The channel
estimate, $\hat{\mathbf{g}}[n-1]$, is found using an EW-RLS estimation algorithm. The time index
of the channel estimate is $[n - 1]$ since only transmitted symbols up until time $n - 1$
are used in the channel estimate.

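Using the channel model from eq. (2.2.2),

$$u[n] = \mathbf{g}^H[n]\mathbf{d}'[n] + \nu[n],$$

the effective noise eq. (3.4.2) can be rewritten as

$$\begin{aligned}
\mu[n] &= \mathbf{g}^H[n]\mathbf{d}'[n] + \nu[n] - \hat{\mathbf{g}}^H[n-1]\mathbf{d}'[n] \\
&= (\mathbf{g}[n] - \hat{\mathbf{g}}[n-1])^H\mathbf{d}'[n] + \nu[n] \\
&= \boldsymbol{\gamma}^H[n]\mathbf{d}'[n] + \nu[n].
\end{aligned} \tag{3.4.3}$$

The term $\boldsymbol{\gamma}[n]$ is the a priori channel estimation error,

$$\boldsymbol{\gamma}[n] = \mathbf{g}[n] - \hat{\mathbf{g}}[n-1]. \tag{3.4.4}$$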


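Nadakuditi and Preisig [65, 66, 75] noted that the channel coefficients and channel
estimation error could be modeled using an extended state space model,

$$\begin{bmatrix}\mathbf{g}[n] \\ \boldsymbol{\gamma}[n]\end{bmatrix} =
\begin{bmatrix}\alpha\mathbf{I} & \mathbf{0} \\ (\alpha - 1)\mathbf{I} & \mathbf{I} - \mathbf{k}[n-1]\mathbf{d}^H[n-1]\end{bmatrix}
\begin{bmatrix}\mathbf{g}[n-1] \\ \boldsymbol{\gamma}[n-1]\end{bmatrix} +
\begin{bmatrix}\mathbf{v}[n] \\ \mathbf{v}[n] - \mathbf{k}[n-1]\nu^*[n-1]\end{bmatrix}. \tag{3.4.5}$$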


The adaptation gain vector, $\mathbf{k}[n]$, is defined as

$$\mathbf{k}[n] = \left(\sum_{i=1}^{n}\lambda^{n-i}\mathbf{d}[i]\mathbf{d}^H[i]\right)^{-1}\mathbf{d}[n], \qquad 0 < \lambda < 1. \tag{3.4.6}$$
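The a priori prediction error of eq. (3.4.2) and the gain vector of eq. (3.4.6) can be seen together in a textbook EW-RLS recursion. The sketch below is a generic RLS loop, not the author's code: the function name, the initialization constant `delta`, the forgetting factor, and the noise-free static test channel are assumptions.

```python
import numpy as np

def ew_rls_channel_tracker(d_vecs, u, lam=0.98, delta=1e-2):
    """Track g in u[n] = g^H d[n] + noise; returns final estimate and a priori errors."""
    n_taps = d_vecs.shape[1]
    g_hat = np.zeros(n_taps, dtype=complex)
    P = np.eye(n_taps, dtype=complex) / delta     # inverse of the weighted data correlation
    mu = np.zeros(len(u), dtype=complex)
    for n in range(len(u)):
        d = d_vecs[n]
        mu[n] = u[n] - np.vdot(g_hat, d)          # effective noise: u[n] - g_hat^H[n-1] d[n]
        Pd = P @ d
        k = Pd / (lam + np.vdot(d, Pd))           # gain vector, equal to P[n] d[n]
        g_hat = g_hat + k * np.conj(mu[n])        # channel estimate update
        P = (P - np.outer(k, d.conj()) @ P) / lam # inverse correlation update
    return g_hat, mu

rng = np.random.default_rng(5)
g_true = np.array([1.0, 0.5j, -0.25])             # illustrative static channel
d = rng.choice([-1.0, 1.0], size=503).astype(complex)
# Rows are sliding data vectors [d[n], d[n-1], d[n-2]]
d_vecs = np.stack([d[2 - k:503 - k] for k in range(3)], axis=1)
u = d_vecs @ g_true.conj()                        # noise-free g^H d[n]

g_hat, mu = ew_rls_channel_tracker(d_vecs, u)
print(np.round(g_hat, 4), abs(mu[-1]))
```

For a static, noise-free channel the prediction error decays toward zero; with a time-varying channel it retains the lag-error component discussed in this section.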


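Using direct averaging methods [51], the channel estimation error correlation matrix,
$\mathbf{R}_\gamma$, can be approximated as [65, 66]

$$\mathbf{R}_\gamma = E\{\boldsymbol{\gamma}[n]\boldsymbol{\gamma}^H[n]\} \approx \chi(\alpha,\lambda)\mathbf{R}_v + \beta(\lambda)\rho\mathbf{I}. \tag{3.4.7}$$

In these calculations, the observation noise and transmitted data symbols are both
assumed to be white with variance $\sigma_\nu^2$ and $\sigma_d^2$, respectively. The symbol $\rho$ is the inverse
SNR, $\rho = \sigma_\nu^2/\sigma_d^2$, as in the previous section. The scaling parameters $\chi(\alpha,\lambda)$ and $\beta(\lambda)$
are [66]

$$\chi(\alpha,\lambda) = \frac{(1-\alpha\lambda)(1-\alpha^*) + (1-\alpha^*\lambda)(1-\alpha)}{(1-|\alpha|^2)(1+\lambda)(1-\alpha\lambda)(1-\alpha^*\lambda)} \tag{3.4.8}$$

$$\beta(\lambda) = \frac{(1-\lambda)}{(1+\lambda)}. \tag{3.4.9}$$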


The channel estimation error correlation matrix from eq. (3.4.7) is the sum of two
quantities: $\chi(\alpha,\lambda)\mathbf{R}_v$ and $\beta(\lambda)\rho\mathbf{I}$. The first quantity, $\chi(\alpha,\lambda)\mathbf{R}_v$, is the part of the
channel estimation error caused by the time-variability of the channel. This error
is sometimes called lag error [31]. The second quantity, $\beta(\lambda)\rho\mathbf{I}$, is the part of the
channel estimation error due to the observation noise. For a time-invariant channel,
only the second component is present.

Assuming white observation noise, the second quantity in the sum from eq. (3.4.7)
is diagonal, so the off-diagonal elements must come from the first quantity of the
sum, $\chi(\alpha,\lambda)\mathbf{R}_v$. Experimentally, the channel estimation errors tend to dominate the
observation noise for the observed range of SNR, so the first quantity in the sum from
eq. (3.4.7) is dominant even when the noise is not white. Since $\chi(\alpha,\lambda)$ is a scalar, the
channel process noise correlation matrix, $\mathbf{R}_v$, must contain off-diagonal elements caused
by correlation between channel process noise coefficients. The off-diagonal elements
in $\mathbf{R}_\gamma$ are therefore caused by correlations in the channel coefficient process noise. In the next
section, this observation is used to show that the off-diagonal elements in the effective
noise correlation matrix are caused by correlated changes in the channel coefficients.
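The two scaling parameters are simple to evaluate, which makes it easy to see how the lag-error and noise terms trade off as the forgetting factor changes. The sketch below only evaluates eqs. (3.4.7) to (3.4.9); the numerical values of the model parameter, forgetting factor, inverse SNR, and process noise matrix are assumptions chosen for illustration.

```python
import numpy as np

def chi(alpha, lam):
    """Lag-error scaling of eq. (3.4.8); alpha may be complex with |alpha| < 1."""
    a, ac = alpha, np.conj(alpha)
    num = (1 - a * lam) * (1 - ac) + (1 - ac * lam) * (1 - a)
    den = (1 - abs(a) ** 2) * (1 + lam) * (1 - a * lam) * (1 - ac * lam)
    return (num / den).real          # numerator and denominator are both real

def beta(lam):
    """Observation-noise scaling of eq. (3.4.9)."""
    return (1 - lam) / (1 + lam)

alpha, lam, rho = 0.999, 0.98, 0.01            # illustrative values
R_v = 1e-4 * np.array([[1.0, 0.6],
                       [0.6, 1.0]])            # correlated process noise (assumed)

R_gamma = chi(alpha, lam) * R_v + beta(lam) * rho * np.eye(2)   # form of eq. (3.4.7)
print(chi(alpha, lam), beta(lam))
print(R_gamma)
```

In this example the off-diagonal entries of `R_gamma` equal the scaled off-diagonal entries of `R_v`, since the white-noise term contributes only to the main diagonal.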


3.4.2  Structure  of  effective  noise  correlation  matrix  using 
Markov  channel  update  model 

The effective noise components, $\mu[n]$ from eq. (3.4.2), can be stacked into a vector,

$$\boldsymbol{\mu}[n] = \begin{bmatrix}\mu[n+N_a] & \cdots & \mu[n] & \cdots & \mu[n-N_c+1]\end{bmatrix}^T. \tag{3.4.10}$$

Recall that in eq. (3.3.3), this effective noise vector is related to the channel estimation
error and the observation noise by the relation

$$\boldsymbol{\mu}[n] = \boldsymbol{\Gamma}^H[n]\mathbf{d}[n] + \boldsymbol{\nu}[n].$$

For clarity in this discussion, the columns of $\boldsymbol{\Gamma}[n]$ are labeled as

$$\boldsymbol{\Gamma}[n] = \begin{bmatrix}\boldsymbol{\gamma}'_0[n] & \boldsymbol{\gamma}'_1[n] & \cdots & \boldsymbol{\gamma}'_{N-1}[n]\end{bmatrix}, \tag{3.4.11}$$

where $\boldsymbol{\gamma}'_i[n]$ is the appropriate channel estimation error vector, $\boldsymbol{\gamma}_i[n]$, padded with
zeros so that $\boldsymbol{\Gamma}[n]$ is a convolution matrix (see Section 2.2 for details of the channel
convolution matrix). The $(i,j)$th element of the $\mathbf{R}_\Gamma[n]$ matrix is

$$[\mathbf{R}_\Gamma[n]]_{(i,j)} = E\{\boldsymbol{\gamma}'^H_i[n]\boldsymbol{\gamma}'_j[n]\}. \tag{3.4.12}$$

Assuming the process noise, $\mathbf{v}[n]$, is stationary, the elements of $\mathbf{R}_\Gamma[n]$ are constant,
i.e. $\mathbf{R}_\Gamma[n] = \mathbf{R}_\Gamma$. Furthermore, the matrix $\mathbf{R}_\Gamma$ is Toeplitz [75], with the elements of the
$i$th diagonal equal to the sum of the elements of the $i$th diagonal of $\mathbf{R}_\gamma$ [75]. Assuming
the observation noise is white, the off-diagonal terms in the effective noise correlation
matrix are due to correlated fluctuations in neighboring channel coefficients.
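The diagonal-sum property can be verified for a single realization of the estimation error. The sketch assumes that every column of the error convolution matrix is a fully shifted copy of the same error vector (no truncation at the edges), which is an idealization of the convolution matrix described in Section 2.2; the helper names and sizes are hypothetical.

```python
import numpy as np

def convolution_matrix(gamma, n_cols):
    """Stack shifted, zero-padded copies of gamma as columns."""
    N = len(gamma)
    Gam = np.zeros((N + n_cols - 1, n_cols), dtype=complex)
    for j in range(n_cols):
        Gam[j:j + N, j] = gamma
    return Gam

def toeplitz_from_diagonal_sums(R_gamma, n_cols):
    """Entry (i, j) is the sum of the (j - i)th diagonal of R_gamma."""
    N = R_gamma.shape[0]
    R = np.zeros((n_cols, n_cols), dtype=complex)
    for i in range(n_cols):
        for j in range(n_cols):
            if abs(j - i) < N:
                R[i, j] = np.trace(R_gamma, offset=j - i)
    return R

rng = np.random.default_rng(6)
gamma = rng.standard_normal(4) + 1j * rng.standard_normal(4)   # one error realization
Gam = convolution_matrix(gamma, n_cols=6)

R_direct = Gam.conj().T @ Gam                        # Gamma^H Gamma for this realization
R_sums = toeplitz_from_diagonal_sums(np.outer(gamma, gamma.conj()), n_cols=6)
print(np.allclose(R_direct, R_sums))
```

Because the identity holds for each realization, it also holds in expectation, so the expected matrix inherits the Toeplitz structure whenever the error statistics are stationary.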

The  next  section  shows  how  the  assumption  of  a  Toeplitz  matrix  can  be  exploited 
to  create  an  algorithm  which  both  reduces  computational  complexity  and  improves 
performance  over  previously  proposed  algorithms. 


3.5  Estimating the effective noise correlation matrix

Section 3.3 showed that the effective noise correlation matrix is the sum of the observation
noise correlation matrix and a term caused by channel estimation errors. Section 3.4
showed that, under the assumptions of a slowly varying channel and stationarity
of the observation noise statistics, the effective noise correlation matrix is Toeplitz.
The current section provides an algorithm for estimating the entire effective noise
correlation matrix by estimating the first row and using the Toeplitz-Hermitian structure.

Recall from eq. (3.3.3) that the effective noise is the difference between
the received signal and the estimated received signal formed from the past transmitted symbols and the
estimated channel,

$$\boldsymbol{\mu}[n] = \mathbf{u}[n] - \hat{\mathbf{G}}^H[n-1]\mathbf{d}[n].$$

The previous section showed that the effective noise correlation matrix is approximately
Toeplitz, so the effective noise correlation matrix can be constructed using
only an estimate of its first row. This estimate can be computed using the biased
correlation of the received signal prediction error,

$$\tilde{\mu}_i[n] = \frac{1}{L_{ff}}\sum_{j=1}^{L_{ff}-i}\mu[n,j]\,\mu^*[n,j+i], \qquad i = 0, \ldots, L_{ff}-1, \tag{3.5.1}$$

where $\mu[n,j]$ is the $j$th element of the effective noise vector at time $n$, $L_{ff} = L_c + L_a$
is the number of feedforward equalizer coefficients, and $\tilde{\mu}_i[n]$ is the $i$th component of the
biased received signal prediction error vector correlation estimate.

Assuming the effective noise is ergodic, a time-average of $\tilde{\mu}_i[n]$ is a good estimate
of the effective noise correlation at lag $i$. To accommodate time-variability, an
exponentially-windowed sample average is used to approximate the ensemble average,

$$\hat{r}_{\mu,i}[n] = \frac{1-\lambda_{corr}}{1-\lambda_{corr}^{\,n+1}}\sum_{k=0}^{n}\lambda_{corr}^{\,n-k}\,\tilde{\mu}_i[k]. \tag{3.5.2}$$

Using an exponentially-windowed sample average allows the time-varying nature
of the statistics to be captured through $\lambda_{corr}$ while approximating the value of
the $i$th component of the first row of the effective noise correlation matrix at time
$n$. The complete effective noise correlation matrix is then constructed by assuming a
Toeplitz-Hermitian structure.
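The estimation procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: the class and function names are hypothetical, the conjugation convention follows the first-row form used in this rewrite, and the normalized exponential window is implemented with a running weight rather than the closed-form factor.

```python
import numpy as np

def biased_autocorr(mu_vec):
    """Biased lag estimates: sum_j mu[j] mu*[j+i] / L for i = 0..L-1."""
    L = len(mu_vec)
    return np.array([np.sum(mu_vec[:L - i] * np.conj(mu_vec[i:])) / L for i in range(L)])

def toeplitz_hermitian(first_row):
    """Build a Hermitian Toeplitz matrix from its first row."""
    L = len(first_row)
    idx = np.arange(L)
    lag = idx[None, :] - idx[:, None]              # column index minus row index
    return np.where(lag >= 0, first_row[np.abs(lag)], np.conj(first_row[np.abs(lag)]))

class ToeplitzNoiseTracker:
    """Exponentially windowed tracker of the first row of the noise correlation matrix."""

    def __init__(self, L, lam_corr=0.99):
        self.lam = lam_corr
        self.acc = np.zeros(L, dtype=complex)      # weighted sum of lag estimates
        self.weight = 0.0                          # sum of the window weights

    def update(self, mu_vec):
        self.acc = self.lam * self.acc + biased_autocorr(mu_vec)
        self.weight = self.lam * self.weight + 1.0
        return toeplitz_hermitian(self.acc / self.weight)

rng = np.random.default_rng(4)
tracker = ToeplitzNoiseTracker(L=8, lam_corr=0.995)
for _ in range(2000):
    mu_vec = (rng.standard_normal(8) + 1j * rng.standard_normal(8)) / np.sqrt(2)
    R_hat = tracker.update(mu_vec)
```

With white unit-variance input, as in this usage example, the tracked matrix should stay close to the identity; correlated effective noise would populate the off-diagonals.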

When the autocorrelation is computed using a fast Fourier transform,
the computational complexity of the proposed algorithm is $O(L_{ff}\log_2(L_{ff}))$.
Previously proposed algorithms had complexity of $O(L_{ff}^2)$ since they required a
vector outer product. For an underwater channel where the feedforward coefficients
can number in the tens or hundreds, the proposed algorithm has a noticeably lower
computational complexity.
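The FFT-based computation of the biased correlation can be sketched and checked against the direct sum. The function names and the zero-padding length are choices made for this illustration; the conjugation convention again follows the first-row form.

```python
import numpy as np

def biased_autocorr_direct(mu_vec):
    """Direct O(L^2) evaluation of sum_j mu[j] mu*[j+i] / L."""
    L = len(mu_vec)
    return np.array([np.sum(mu_vec[:L - i] * np.conj(mu_vec[i:])) / L for i in range(L)])

def biased_autocorr_fft(mu_vec):
    """Same quantity via the FFT in O(L log L)."""
    L = len(mu_vec)
    n_fft = 1 << (2 * L - 1).bit_length()          # power of two >= 2L-1, avoids circular wrap
    spectrum = np.fft.fft(mu_vec, n_fft)
    # ifft of |X|^2 gives sum_m x[m+k] x*[m]; its conjugate is sum_m x[m] x*[m+k]
    corr = np.fft.ifft(np.abs(spectrum) ** 2)[:L]
    return np.conj(corr) / L

rng = np.random.default_rng(7)
mu_vec = rng.standard_normal(64) + 1j * rng.standard_normal(64)
print(np.allclose(biased_autocorr_direct(mu_vec), biased_autocorr_fft(mu_vec)))
```

Zero-padding to at least twice the vector length is what turns the circular correlation returned by the FFT into the linear correlation needed here.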

To summarize the advantages of using the proposed algorithm for estimating the
effective noise correlation matrix over previously proposed algorithms:

•  The number of components that must be tracked is reduced from $L_{ff}^2$ to $L_{ff}$.
This helps with the memory requirements and enables extra ensemble averaging
since the diagonals of the correlation matrix are averaged.

•  The structure of the interference plus noise correlation matrix is easily modified,
i.e. if the effective noise correlation matrix is known to be tri-diagonal, then only
two coefficients have to be tracked.

•  There is a performance improvement due to restricting the matrix to be Toeplitz.

•  The computational complexity is reduced from $O(L_{ff}^2)$ to $O(L_{ff}\log_2(L_{ff}))$.

The  first  point  is  especially  interesting  since  underwater  communication  problems 
are  often  data  limited  due  to  time-variation  of  the  channel.  This  method  provides 
a  way  to  more  effectively  use  the  available  data.  In  the  next  section  the  proposed 
algorithm  and  others  are  compared  using  a  CEB-DFE  on  experimental  data. 


3.6  Experimental  results 

The SPACE08 experiment was performed off the coast of Martha’s Vineyard, MA from Oct.
14th through Nov. 1st, 2008. The water depth was approximately 15 meters, the
transmitter was approximately 4 meters from the sea floor, and the bottoms of the
receive arrays were about 3.25 meters above the sea floor. For the data analyzed here
the distance from the transmitter to the receiver was 200 meters. The receiver was
the twelfth receive element from the bottom of a 24 element array with 5 centimeter
element spacing. Figure 3.6.1 illustrates the setup of this experiment.

The data signal had a bandwidth of B = 6.51 kHz and was modulated onto a
carrier with frequency fc = 12.5 kHz. The sampling frequency was fs = 39062.5
samples/second. The transmitted signal analyzed here is a 4095-length M-sequence
that was repeated 89 times for a packet that is one minute in length (with some
zero-padding). The data was modulated using binary phase shift keying (BPSK) onto a
square-root raised cosine pulse.

3.6.1  Fluctuation of effective SNR

Using the experimental data, the first quantity examined is the fluctuation of the
estimated effective noise correlation matrix coefficients. Figure 3.6.2 shows the
top-left estimated element of the effective noise correlation matrix as it evolves over a
minute-long data packet. This plot shows that the effective noise statistics are
time-varying and need to be tracked. The coherence time of these coefficients is apparently
around five seconds, which is very long compared with the sampling period, so an
assumption of time-invariance over the averaging window is reasonable.

The curve in Figure 3.6.2 highlights the variability of the observed SNR over the
packet duration. A single element of the effective noise correlation matrix changes
by more than 5 dB over an interval of less than ten seconds. Using only an average
value would lead to increased residual mean squared error after equalization.


3.6.2  DFE  comparison 

To determine the effectiveness of the proposed algorithm, several methods for
approximating the effective noise correlation matrix were examined. Table 3.6.1 provides a
description of each of the methods.

Figure 3.6.2: Top-left element of the estimated effective noise correlation matrix,
$[\hat{\mathbf{R}}_\mu]_{(1,1)}$, plotted against time (s), from October 26, 2008 at time 0800. The variance is tracked using an
exponential window algorithm. This value is a measure of the effective noise variance.
Over the one minute packet there is a 5 dB peak to peak change with a coherence
time of approximately five seconds.

Table 3.6.1: Description of methods compared using the SPACE08 data set.

Method label    Description

AMB             CEB-DFE where the effective noise correlation matrix is
                approximated as a scaled identity matrix where its scaling
                is based on the SNR measured from the basebanded data
                before any equalization.

SING            CEB-DFE where the top-left entry of the estimated effective
                noise correlation matrix, $[\hat{\mathbf{R}}_\mu]_{(1,1)}$, is used as an estimate
                of the effective noise variance. The effective noise correlation
                matrix is approximated as a scaled identity matrix.

DIAG            CEB-DFE where the effective noise correlation matrix is
                approximated by a diagonal matrix with entries equal to
                the main diagonal of $\hat{\mathbf{R}}_\mu$.

FULL            CEB-DFE where the full effective noise correlation matrix
                is estimated from the data.

TOEP            CEB-DFE where the effective noise correlation matrix is
                estimated as a Toeplitz-Hermitian matrix as described in
                this chapter.

Figure 3.6.3: Mean squared error (MSE) results after DFE for SPACE08 experiment
200 m data using different estimates of the effective noise correlation matrix defined
in Table 3.6.1. Data is ordered from (a) calm conditions to (c) stormy conditions.
(a) Data from October 23, 2008 at time 1800. (b) Data from October 20, 2008 at time 1200.
(c) Data from October 26, 2008 at time 0800.

The mean squared error is the average squared magnitude of the residual data estimation
error before any symbol decisions are made:

$$\sigma^2_{MSE} = \frac{1}{M}\sum_{m=1}^{M}\left|\hat{d}[m] - d[m]\right|^2, \tag{3.6.1}$$

where M is the number of transmitted symbols. In these results, M = 100,000.
The in-band SNR is varied by adding an appropriately scaled realization of the ambient
noise, which was recorded using the same hydrophone shortly after the signal
transmission ended.

Figure 3.6.3 shows the mean squared error from three different days of the SPACE08
experiment. These figures are ordered from the calmest ocean conditions on the top
to the roughest conditions at the bottom. On October 23 (Julian day 297) the wave
height was 0.5 meters, on October 20 (Julian day 294) it was 1.2 meters, and on
October 26 (Julian day 300) it was 3 meters.

The data show that there is a penalty for assuming that the estimated noise
matrix is diagonal (labeled DIAG in the plot). The data also show that there is no
additional penalty for estimating this diagonal matrix using only one estimated value
and a Toeplitz structure (labeled SING in the plot).

The plot labeled AMB is created using an equalizer which estimated the effective
noise correlation matrix as the diagonal matrix with the diagonal equal to the inverse
of an SNR measured from the received signal. This approximation does not account
for channel estimation errors, so as the SNR is increased there is a growing model mismatch
between the estimated and the true effective noise correlation matrix. Above a
threshold, the MSE therefore increases with SNR (as observed).

The results show that a DFE using an effective noise correlation matrix calculated
using the proposed method (TOEP) slightly outperforms one using a matrix
calculated with no Toeplitz constraint (FULL). This result emphasizes the overall gain,
since there is a slight performance improvement together with a decrease in the amount
of computation needed.

The  improvement  in  performance  is  the  result  of  the  reduction  in  the  number  of 
free  parameters  that  must  be  estimated  when  the  Toeplitz  assumption  is  imposed. 

The  results  also  demonstrate  that  assuming  that  the  effective  noise  correlation 
matrix  is  diagonal  and  therefore  not  accounting  for  the  full  correlation  structure  of 
this  noise  results  in  a  significant  performance  loss. 

Therefore, the proposed method imposes appropriate structure on the estimated
effective noise correlation matrix to both improve performance with respect to other
proposed or commonly used approaches and to reduce computational complexity when
compared to the next best performing method.

3.7  Conclusions


In this chapter, the existence of non-zero off-diagonal elements in the effective noise
correlation matrix has been justified based upon the physical characteristics of the
underwater acoustic propagation environment: the fluctuations of neighboring taps of
the channel impulse response are correlated, which causes off-diagonal terms to appear
in the effective noise correlation matrix.

An algorithm exploiting the Toeplitz and Hermitian structure of this matrix was
developed that not only reduces computational complexity when compared to algorithms
that do not impose the Toeplitz constraint but also results in a DFE whose
performance is better than that achieved by DFEs using other estimators of the matrix.
The reduction in computational complexity is important in array systems where
the number of coefficients being estimated is quite large and efficient algorithms are
needed for practical implementation.

Experimental  data  indicated  that  the  statistics  of  the  effective  noise  need  to  be 
tracked  to  prevent  loss  of  system  performance.  The  variance  of  the  effective  noise  can 
vary  by  as  much  as  5  dB  in  a  minute-long  packet. 

The  literature  on  equalization  of  RF  channels  uses  a  non-adaptive,  diagonal  es¬ 
timate  of  the  effective  noise  correlation  matrix.  Methods  described  in  this  chapter 
could  be  applied  to  equalizers  for  RF  channels  to  reduce  the  mean  squared  error  of 
the  equalized  symbols. 
Chapter  4


Physically  constrained  beamspace 
processing  for  a  multichannel  DFE 

4.1  Introduction 

Multichannel  decision  feedback  equalization  that  has  one  feedforward  section  for  each 
sensor  and  one  feedback  section  for  the  whole  system  has  been  shown  to  be  a  nearly 
optimal  method  for  handling  the  difficulties  of  the  underwater  communication  channel 
[105]. When there are a large number of array elements, however, the implementation
with one feedforward section per sensor may be prohibitive due to the rate of channel
variability versus the degrees of freedom and high computational complexity. Channel
time-variability limits the time interval over which the constant channel assumption
is reasonable, thus limiting the averaging interval of adaptive algorithms. The
computational complexity is proportional to the square of the number of channels. Both
of these problems can be mitigated through beamspace processing, where the number
of DFE feedforward sections is now the number of beams.

Beamforming is a spatial filtering technique; only energy from certain directions
is passed through. Under a narrowband assumption, the received signal, including all
multipath components, arrives in a restricted angular space due to channel physics.
Beams can therefore be used to pass only that restricted angular space, reducing the
problem dimensionality without reducing the available degrees of freedom of the
received signal. Processing in beamspace reduces both the amount of data needed for
averaging and the computational complexity.

Stojanovic  et  al.  [109]  showed  that  when  the  angle  of  arrival  (AoA)  is  known  for 
all  paths  through  the  environment,  the  noise  is  spatially  and  temporally  white,  and 
the signal is narrowband, the MMSE solution for the beamformer weights is a matrix
with  each  column  equal  to  an  array  manifold  vector,  one  for  each  path.  In  the  same 
paper,  Stojanovic  proposed  an  adaptive  method  to  find  the  beamformer  weights  since 
the  AoA  are  often  unknown.  This  method  is  observed  to  work  well  [109],  but  does 
not  significantly  reduce  the  computational  complexity. 

In  the  current  work  a  non-adaptive  method  is  proposed  which  finds  the  optimal 
set  of  beams  for  a  given  range  for  the  AoA  on  each  path.  This  provides  robustness 
of  the  beamformer  to  arrival  direction  and  allows  the  beams  to  be  computed  off-line. 
Using  an  MMSE  criterion,  the  optimal  beamformer  weights  for  a  range  of  AoA  on  a 
line array are shown to be the Discrete Prolate Spheroidal Sequences (DPSS). Slepian
is credited with discovering the DPSS [94]. These sequences have a number of nice
properties, such as mutual orthogonality, symmetry, and real-valued coefficients. Many
methods have been studied for finding the DPSS [73].

In  this  thesis,  a  vertical  line-array  receiver  is  assumed.  This  choice  is  made  for 
a  number  of  reasons:  first,  the  physics  of  the  acoustic  channel  will  naturally  bound 
the  angle  of  arrival.  Second,  when  using  a  narrow-band  assumption  on  a  line-array 
the  array  manifold  vectors  are  complex  exponential  functions  which  simplifies  the 
derivations.  Third,  the  available  experimental  data  uses  a  vertical  line-array  receiver. 

The  proposed  method  in  this  chapter  to  bound  the  AoA  range  uses  a  geometric 
ray-path  model  of  sound  propagation.  The  ray-path  model  along  with  the  assumption 
of  a  Pekeris  waveguide  [42],  provides  the  AoA  span  and  number  of  arrivals  within  a 
given  delay  spread.  When  the  number  of  arrivals  in  the  delay  span  is  less  than  the 
number  of  sensors,  there  are  fewer  feedforward  sections  in  the  multichannel  DFE  for  a 
beamspace  where  there  is  one  section  for  each  arrival  than  in  sensor  space  where  there 
is  one  section  for  each  sensor.  The  ray-path  model  parameters  such  as  water  column 
depth,  propagation  distance,  and  transmitter  and  receiver  geometry  are  often  either 
known a priori or can be measured in situ using commonly available instruments and
techniques. 

The proposed methods for determining the beamformer weights and the number
of beams are verified using experimental data from the Surface Processes and Acoustic
Communication Experiment (SPACE08), performed at the Martha's Vineyard
Coastal Observatory in 2008.


4.2  Receiver  structure 

The receiver structure studied throughout this chapter consists of a wideband
beamformer followed by a multichannel DFE, which is an extension of the DFE introduced
in Section 2.4. This structure allows for flexibility in the design of both the
beamformer and the DFE. The beamformer reduces the signal-space dimensionality from
the number of sensors down to the number of beams. The multichannel DFE equalizes,
coherently combines, and estimates the transmitted symbol.

4.2.1  Multichannel  decision  feedback  equalization 

Recall  that  the  decision  feedback  equalizer  (DFE)  consists  of  two  linear  filters  working 
together:  the  feedforward  filter  collects  the  energy  from  the  received  signal  and  shapes 
its  response  and  the  feedback  filter  cancels  the  inter-symbol  interference  (ISI)  from 
previously  received  symbols  [61,  81].  The  general  DFE  equation  is 

\[ \hat{d}[n] = \sum_{\ell=-L_a}^{L_c-1} h_{ff}^{*}[\ell]\, u[n-\ell] + \sum_{\ell=1}^{L_{fb}} h_{fb}^{*}[\ell]\, \tilde{d}[n-\ell], \tag{4.2.1} \]

where $u[n]$ is the baseband received data, $\tilde{d}[n]$ is the sequence of past symbol
decisions, and $\hat{d}[n]$ is the filtered received data before a symbol decision has
been made. The $L_a + L_c$ feedforward filter coefficients are represented as $h_{ff}[n]$,
where $L_a$ is the number of acausal coefficients and $L_c$ is the number of causal
coefficients. The $L_{fb}$ feedback coefficients are represented as $h_{fb}[n]$. The total
number of DFE coefficients is $L = L_a + L_c + L_{fb}$.
Figure  4.2.1:  A  multichannel  decision  feedback  equalizer.


Using  vector  notation,  the  DFE  equation  can  be  written  more  compactly, 

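\[ \hat{d}[n] = \mathbf{h}_{ff}^{H}\mathbf{u}[n] + \mathbf{h}_{fb}^{H}\tilde{\mathbf{d}}[n] = \mathbf{h}^{H}\mathbf{z}[n], \tag{4.2.2} \]

where $\mathbf{u}[n]$ is a vector of received signal samples and $\tilde{\mathbf{d}}[n]$ is a
vector of past data symbol estimates. Both of these vectors are defined more carefully
in Section 2.4. Two vectors are reintroduced for notational simplicity:
$\mathbf{h}^{T} = [\mathbf{h}_{ff}^{T} \; \mathbf{h}_{fb}^{T}]$ is a vector of filter coefficients and
$\mathbf{z}^{T}[n] = [\mathbf{u}^{T}[n] \; \tilde{\mathbf{d}}^{T}[n]]$ is a data vector containing both
the received data and the past symbol estimates.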

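This framework can be modified to accommodate multiple receivers by expanding
the definition of the filter and data vectors. When there are $K$ receive elements, the
vectors $\mathbf{h}$ and $\mathbf{z}[n]$ are

\[ \mathbf{h}^{T} = [\mathbf{h}_{ff,1}^{T} \; \cdots \; \mathbf{h}_{ff,K}^{T} \; \mathbf{h}_{fb}^{T}], \qquad \mathbf{z}^{T}[n] = [\mathbf{u}_{1}^{T}[n] \; \cdots \; \mathbf{u}_{K}^{T}[n] \; \tilde{\mathbf{d}}^{T}[n]], \]

where $\mathbf{u}_i[n]$ is the vector of data received at the ith receive element and
$\mathbf{h}_{ff,i}$ is the vector of feedforward filter coefficients for the ith receive
element. See Figure 4.2.1 for an illustration of the functionality of a multichannel DFE.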

A fractionally sampled equalizer is often used to reduce synchronization errors
[82]. Using the supplied framework, the feedforward filters will each have $T_{fs}$ samples
per received symbol, while the feedback filter will remain the same length. At each
iteration, the data fed into each channel of the feedforward equalizer are moved ahead
by $T_{fs}$ samples.

Using  data  until  time  n,  the  LSE  solution  for  the  DA-DFE  filter  coefficients  is 

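\[ \mathbf{h}[n] = \left( \sum_{i=-\infty}^{n} \mathbf{z}[i]\mathbf{z}^{H}[i] \right)^{-1} \left( \sum_{i=-\infty}^{n} \mathbf{z}[i]\, d^{*}[i] \right). \tag{4.2.3} \]

Notice that the filter coefficients now explicitly depend on time due to the dependence
on the received data. Using an EW-RLS algorithm, the DA-DFE filter coefficients
are computed using the relation

\[ \mathbf{h}[n] = \left( \sum_{i=-\infty}^{n} \lambda^{n-i}\, \mathbf{z}[i]\mathbf{z}^{H}[i] \right)^{-1} \left( \sum_{i=-\infty}^{n} \lambda^{n-i}\, \mathbf{z}[i]\, d^{*}[i] \right). \tag{4.2.4} \]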


The DA-DFE structure is used in this chapter because it has low complexity
compared with the CEB-DFE (also known as the MMSE DFE). Since the CEB-DFE
algorithm requires the inversion of an $L \times L$ matrix, its complexity is $O(L^3)$. The
DA-DFE algorithm uses a data-recursive update to find a new solution, so its complexity
is only $O(L^2)$. A second reason the DA-DFE is used is that the performance difference
between the DA-DFE and CEB-DFE is negligible when the SNR is moderate to low,
which is where many underwater communication systems operate. There is an
in-depth comparison of the DA-DFE and CEB-DFE in Chapter 6.
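The exponentially weighted solution in Eq. (4.2.4) is normally not evaluated by re-inverting the correlation matrix at every symbol. The following is a minimal Python sketch of the standard EW-RLS recursion, assuming NumPy is available. The function names, the forgetting factor `lam`, and the initial regularization `delta` are illustrative choices and are not taken from this thesis. Each update uses only matrix-vector and outer products, which is where the $O(L^2)$ cost per symbol comes from.

```python
import numpy as np

def ew_rls_update(h, P, z, d, lam=0.99):
    """One exponentially weighted RLS step.

    h : current coefficient vector (length L)
    P : current inverse of the weighted data correlation matrix (L x L)
    z : new data vector z[n]
    d : desired symbol d[n] (training symbol or past decision)
    """
    Pz = P @ z
    k = Pz / (lam + np.vdot(z, Pz))               # gain vector
    err = d - np.vdot(h, z)                       # a priori error d[n] - h^H z[n]
    h_new = h + k * np.conj(err)
    P_new = (P - np.outer(k, np.conj(Pz))) / lam  # rank-one update of the inverse
    return h_new, P_new

def run_ew_rls(Z, D, lam=0.99, delta=1e-2):
    """Run the recursion over the rows of Z (data vectors) and entries of D."""
    L = Z.shape[1]
    h = np.zeros(L, dtype=complex)
    P = np.eye(L, dtype=complex) / delta          # inverse of the initial matrix delta * I
    for z, d in zip(Z, D):
        h, P = ew_rls_update(h, P, z, d, lam)
    return h
```

With a zero initial coefficient vector, the recursion reproduces the batch solution of Eq. (4.2.4) with the sum started at the first sample and a decaying regularization term added to the correlation matrix.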


4.2.2  Beamforming 

Figure 4.2.2: A multichannel decision feedback equalizer with a beamformer front-end
to reduce computational complexity.

In Eq. (4.2.4), the number of filter coefficients being estimated is $K(L_a + L_c) + L_{fb}$.
In underwater acoustic communication systems, a common practice is to set the number
of feedforward equalizer coefficients based on the delay spread of the significant
received signal energy. Under this criterion, tens of coefficients per feedforward
section are common, resulting in high computational complexity. Stojanovic et
al. [109] noted that, when the signal is narrowband, the observation noise is spatially
and temporally white (or whitened), and the number and directions of all arrivals are
known, using beamformed data is equivalent to full sensor-space processing.
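As a rough illustration of how quickly the coefficient count grows with the number of feedforward sections, the short Python sketch below evaluates the count for a sensor-space and a beamspace receiver. The filter lengths and array sizes are hypothetical values chosen only for the example; they are not the settings used in this thesis.

```python
def dfe_coefficient_count(n_sections, L_a, L_c, L_fb):
    """Total adaptive coefficients: one feedforward filter per section plus one feedback filter."""
    return n_sections * (L_a + L_c) + L_fb

# Hypothetical sizes, for illustration only.
K, P = 24, 5                      # number of sensors, number of beams
L_a, L_c, L_fb = 10, 30, 40       # acausal, causal, and feedback lengths

L_sensor = dfe_coefficient_count(K, L_a, L_c, L_fb)   # sensor-space receiver
L_beam = dfe_coefficient_count(P, L_a, L_c, L_fb)     # beamspace receiver

# Ratio of the per-update costs of an O(L^2) algorithm for the two receivers.
cost_ratio = (L_sensor / L_beam) ** 2
```

For these example values the sensor-space equalizer adapts 1000 coefficients and the beamspace equalizer 240.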

Beamforming  can  be  viewed  as  a  mapping  of  the  received  signal  from  the  physical 
sensor  space  to  beam  space.  This  is  accomplished  by  applying  a  spatial  window¬ 
ing  function  with  desired  spatial-spectral  characteristics.  Even  though  underwater 
acoustic  communication  data  is  not  narrowband,  the  following  wideband  beamform¬ 
ing  method  can  be  used  with  a  linear  array:  a  discrete  Fourier  transform  (DFT)  is 
applied  first  to  the  data,  beamforming  is  applied  separately  to  each  frequency  bin, 
and  the  inverse  DFT  is  applied  to  the  result.  The  beamforming  operation  can  be 
represented  by  the  expression 

\[ \mathbf{u}_{bf}(\omega) = \mathbf{\Phi}^{H}(\omega)\,\mathbf{u}(\omega), \tag{4.2.5} \]

where $\mathbf{\Phi}(\omega)$ is the $K \times P$ beamforming matrix at frequency $\omega$. The notation
$\mathbf{u}(\omega)$ represents the Fourier transform of the received data and
$\mathbf{u}_{bf}(\omega)$ represents the beamformed data, both at frequency $\omega$. The elements
of the vectors are

\[ \mathbf{u}(\omega) = [u_{1}(\omega) \; u_{2}(\omega) \; \cdots \; u_{K}(\omega)]^{T}, \qquad \mathbf{u}_{bf}(\omega) = [u_{bf,1}(\omega) \; \cdots \; u_{bf,P}(\omega)]^{T}, \]

where $u_k(\omega)$ is the received data from sensor $k$ and $u_{bf,p}(\omega)$ is the received
data in beam $p$. There are many good references covering beamforming more completely,
such as [37, 43, 121].

The input to the DA-DFE is the output of a wideband beamformer, $\mathbf{u}_{bf}(\omega)$,
transformed into the time domain. Since the number of beams, $P$, is often much less than
the number of sensors, $K$, this results in a system with algorithmic complexity of
$O(P^2)$, a reduction by a factor of $(K/P)^2$ relative to the sensor-space system, which
has complexity $O(K^2)$. Figure 4.2.2 shows the DA-DFE with a beamformer.
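The DFT, per-bin beamforming, and inverse DFT steps described in this section can be sketched in Python as follows, assuming NumPy. The function name `wideband_beamform` and the callback `weights_for_bin` are hypothetical. The sketch processes a single block and ignores block overlap and edge effects, which a practical receiver would have to handle.

```python
import numpy as np

def wideband_beamform(u, weights_for_bin):
    """Beamform one block of array data frequency bin by frequency bin.

    u               : (K, N) array, one row of N time samples per sensor
    weights_for_bin : callable mapping a normalized frequency (cycles/sample)
                      to a (K, P) beamforming matrix for that bin
    Returns a (P, N) array of beam time series.
    """
    U = np.fft.fft(u, axis=1)                    # DFT along time for every sensor
    freqs = np.fft.fftfreq(u.shape[1])           # normalized frequency of each bin
    columns = [weights_for_bin(f).conj().T @ U[:, b]   # Phi^H(w) u(w) for bin b
               for b, f in enumerate(freqs)]
    U_bf = np.stack(columns, axis=1)             # (P, N) beamformed spectrum
    return np.fft.ifft(U_bf, axis=1)             # back to the time domain
```

Passing a callback that returns the identity matrix for every bin returns the sensor data unchanged, which is a convenient sanity check before substituting frequency-dependent beam weights.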


4.3  Geometric  ray-tracing  propagation  model 

Beamforming  is  a  useful  method  because  it  reduces  computational  complexity  and 
potentially  increases  performance.  After  deciding  to  use  a  beamformer,  the  next  ques¬ 
tions  a  system  designer  might  ask  are  “How  many  beams  should  one  use?”  and  “What 
beam  weights  are  best?”  A  common  idea  in  the  beamforming  literature  is  to  use  an 
algorithm  to  track  each  arrival  angle  separately  and  create  a  set  of  beams  which  are 
the array manifold vectors pointed in the estimated arrival directions [7, 121, 124].
These methods tend to be computationally complex since the angles of arrival are
time-varying. Stojanovic et al. [105] noted that when designing an acoustic
communication system, the beamformer does not need to separate arrivals into separate
beams since the feedforward equalizer adaptively combines the arrivals together by
combining the beams.

Since  channel  motion  induces  changes  in  the  arrival  angles,  the  approach  proposed 
in  this  chapter  is  to  use  a  geometric  model  of  the  arrival  structure  to  calculate  a 
minimum  and  maximum  arrival  angle  and  use  a  set  of  beams  which  span  that  angular 
range. 

Ray tracing is a common method used for high frequency acoustics (above 1 kHz)
[42]. In the ray tracing model considered in this chapter, the channel is assumed
to be a Pekeris waveguide with a pressure-release surface and a soft, flat bottom.
At a boundary the angle of incidence equals the angle of reflection, so a geometric
propagation model depends only on the water column depth, the speed of sound,
the depths of the transmitter and receiver, and the distance between the transmitter
and receiver; these parameters are readily available in many oceanographic
applications. In the Pekeris waveguide, there is a soft bottom, so paths which have
propagation angles above some critical angle are lost. In this work there is no need
to know what the critical angle is, only that there are a limited number of paths.

Table 4.3.1: Elevation arrival angles and delays for the first five arrivals using the
geometric model with a flat surface and flat bottom.

Path            | Arrival angle (radians)                        | Delay (seconds)
----------------+------------------------------------------------+-------------------------------------------------
Direct          | \pi/2 + \arctan((d_T - d_R)/\ell_{TR})         | \sqrt{(d_R - d_T)^2 + \ell_{TR}^2} / c_w
Bottom          | \pi/2 + \arctan((2d_w - d_R - d_T)/\ell_{TR})  | \sqrt{(2d_w - d_R - d_T)^2 + \ell_{TR}^2} / c_w
Surface         | \pi/2 - \arctan((d_R + d_T)/\ell_{TR})         | \sqrt{(d_R + d_T)^2 + \ell_{TR}^2} / c_w
Surface-Bottom  | \pi/2 + \arctan((2d_w - d_R + d_T)/\ell_{TR})  | \sqrt{(2d_w - d_R + d_T)^2 + \ell_{TR}^2} / c_w
Bottom-Surface  | \pi/2 - \arctan((2d_w + d_R - d_T)/\ell_{TR})  | \sqrt{(2d_w + d_R - d_T)^2 + \ell_{TR}^2} / c_w

Table  4.3.1  contains  the  delay  and  elevation  angle  of  arrival  for  the  earliest  arriving 
paths,  using  the  notation 

$d_w$        water column depth [m]
$d_T$        transmitter depth [m]
$d_R$        receiver depth [m]
$\ell_{TR}$  distance from transmitter to receiver [m]
$c_w$        speed of sound in seawater [m/s]

A  ray-path  model  with  a  finite  number  of  paths  is  approximately  accurate  since 
there  are  only  a  small  number  of  paths  propagating  below  the  critical  angle.  The 
arrival  angles  are  bounded  within  the  range  of  the  propagating  paths.  Figure  4.3.1 
illustrates  this  bounded  arrival  structure  for  the  case  of  three  propagating  paths:  the 
direct  path,  the  bottom  bounce  path,  and  the  surface  bounce  path. 
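The entries of Table 4.3.1 follow from image-source geometry: each boundary reflection mirrors the transmitter about the surface or the bottom, and the path is the straight line from the final image to the receiver. The Python sketch below, which uses only the standard library, is one way to compute the five arrivals under that assumption. The function name, the default sound speed of 1500 m/s, and the sign convention (angles measured from the surface direction, so arrivals from below exceed $\pi/2$) are assumptions made for this illustration.

```python
import math

def first_five_arrivals(d_w, d_t, d_r, l_tr, c_w=1500.0):
    """Return {path name: (arrival angle in radians, delay in seconds)}.

    d_w, d_t, d_r : water, transmitter, and receiver depths [m]
    l_tr          : horizontal transmitter-to-receiver distance [m]
    c_w           : sound speed [m/s] (isovelocity assumption)
    """
    # Signed vertical offset of the image source relative to the receiver.
    # Positive means the image lies below the receiver (arrival from the bottom side).
    offsets = {
        "Direct": d_t - d_r,
        "Bottom": 2 * d_w - d_r - d_t,
        "Surface": -(d_r + d_t),
        "Surface-Bottom": 2 * d_w - d_r + d_t,
        "Bottom-Surface": -(2 * d_w + d_r - d_t),
    }
    arrivals = {}
    for name, dz in offsets.items():
        angle = math.pi / 2 + math.atan(dz / l_tr)   # broadside is pi/2
        delay = math.hypot(dz, l_tr) / c_w           # straight-line image distance
        arrivals[name] = (angle, delay)
    return arrivals
```

For example, `first_five_arrivals(d_w=15.0, d_t=9.0, d_r=12.0, l_tr=200.0)` uses a hypothetical shallow-water geometry; the minimum and maximum of the returned angles give the angular span that the beams must cover.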

Figure 4.3.1: Illustration of multipath and the physically constrained angle of arrival
for the shallow water communications channel. The angle of arrival of a path from
the surface is defined as $\theta = 0°$, from the bottom as $\theta = 180°$, and from
broadside as $\theta = 90°$.

The equations for computing the angle of arrival can be extended to an arbitrary
path with delay $\tau_{path}$. For simplicity, the last surface the signal interacts with
before reaching the receiver is specified. When the last bounce is a surface bounce,
the angle of arrival is


\[ \theta_{path,surface} = \arcsin\left( \frac{\ell_{TR}}{c_w \cdot \tau_{path}} \right). \tag{4.3.1} \]


When  the  last  bounce  is  a  bottom  bounce,  the  angle  of  arrival  is 


\[ \theta_{path,bottom} = \frac{\pi}{2} + \arccos\left( \frac{\ell_{TR}}{c_w \cdot \tau_{path}} \right). \tag{4.3.2} \]


Using a delay, $\tau_{relative}$, measured relative to the shortest path (i.e., when the
propagation path is of length $\ell_{TR}$), the equations for angle of arrival can be
rewritten. When the last bounce is a surface bounce, the AoA expression becomes


\[ \theta_{path,surface} = \arcsin\left( \left( \frac{c_w \cdot \tau_{relative}}{\ell_{TR}} + 1 \right)^{-1} \right). \tag{4.3.3} \]


When  the  last  bounce  is  a  bottom  bounce,  the  AoA  expression  becomes 


\[ \theta_{path,bottom} = \frac{\pi}{2} + \arccos\left( \left( \frac{c_w \cdot \tau_{relative}}{\ell_{TR}} + 1 \right)^{-1} \right). \tag{4.3.4} \]
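The relative-delay form of the angle-of-arrival expressions follows from substituting $\tau_{path} = \tau_{relative} + \ell_{TR}/c_w$ into the absolute-delay expressions. A small standard-library Python sketch of that mapping is shown below; the function name and argument names are illustrative only.

```python
import math

def aoa_from_relative_delay(tau_rel, l_tr, c_w, last_bounce):
    """Angle of arrival (radians) for a path delayed by tau_rel seconds
    relative to a straight horizontal path of length l_tr metres."""
    ratio = 1.0 / (c_w * tau_rel / l_tr + 1.0)   # equals l_tr / (c_w * tau_path)
    if last_bounce == "surface":
        return math.asin(ratio)                  # arrives from above broadside
    if last_bounce == "bottom":
        return math.pi / 2 + math.acos(ratio)    # arrives from below broadside
    raise ValueError("last_bounce must be 'surface' or 'bottom'")
```

A path with zero relative delay arrives at broadside, and for a given relative delay the surface and bottom arrivals are mirror images of each other about broadside.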


Figure 4.3.2 shows the delay and angle of arrival predicted by the ray model for
signals propagating along each path (the white crosses), together with the arrival
intensity, as a function of delay and angle, estimated from data collected during
the SPACE08 experiment (described in Section 4.7).

Figure 4.3.2: Estimated angle of arrival and delay of the channel impulse response
arrivals from the SPACE08 experiment on Julian day 297 at time 1800.
The white crosses indicate the arrival points calculated from the geometrical arrival
model. The arrivals are labeled according to their interaction with the surface and
bottom from the transmitter to the receiver: S indicates a surface bounce and B a
bottom bounce.
4.4  Optimal beams for bounded angle-of-arrival subspace

4.4.1  Channel  model 


The previous section provided a limited method for estimating the number of arrivals
impinging on a receiving array, but a natural question arises: “How is the number
of arrivals related to the number of beams needed to capture the signal energy?”
Stojanovic et al. [109] showed that the number of beams should equal the number
of arrivals. To show this, a framework for analyzing the problem is described. The
vector of signals traveling along $P$ paths and received by a linear array with $K$
elements can be modeled as


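\[ \begin{bmatrix} u_0[n] \\ \vdots \\ u_{K-1}[n] \end{bmatrix} = \begin{bmatrix} 1 & \cdots & 1 \\ e^{-j\phi_1} & \cdots & e^{-j\phi_P} \\ \vdots & & \vdots \\ e^{-j(K-1)\phi_1} & \cdots & e^{-j(K-1)\phi_P} \end{bmatrix} \begin{bmatrix} y_1[n] \\ \vdots \\ y_P[n] \end{bmatrix} + \begin{bmatrix} v_0[n] \\ \vdots \\ v_{K-1}[n] \end{bmatrix} = \mathbf{\Phi}\,\mathbf{y}[n] + \mathbf{v}[n]. \tag{4.4.1} \]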


In the expression above, $\mathbf{v}[n]$ is a vector of noise components assumed to be
independent of the signal and $\mathbf{\Phi}$ is a matrix whose columns are the array manifold
vectors pointed in the arrival directions, characterized by
$\phi_k = \frac{2\pi f}{c_w} d_s u_k$. The direction $u_k$ is defined as $u_k = \cos(\theta_k)$,
where $\theta_k$ is the AoA of the kth path, $f$ is the signal frequency being examined,
$c_w$ is the speed of sound in water (assuming an isovelocity channel), and $d_s$ is the
sensor spacing. The array manifold vector for a uniformly spaced linear array is [121]


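\[ \mathbf{v}(u_k) = \begin{bmatrix} 1 \\ e^{-j\frac{2\pi f}{c_w} d_s u_k} \\ \vdots \\ e^{-j\frac{2\pi f}{c_w}(K-1) d_s u_k} \end{bmatrix}. \tag{4.4.2} \]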


The vector $\mathbf{y}[n]$ is called the path space signal since it is the transmitted data
symbols convolved with the impulse response along each path (not including the phase
shift terms in $\mathbf{\Phi}$),

\[ \mathbf{y}[n] = \mathbf{G}_p[n]\,\mathbf{d}'[n]. \tag{4.4.3} \]

The overall path channel impulse response matrix, $\mathbf{G}_p[n]$, includes transmitter and
receiver filtering. Defining $N_a$ as the maximum number of acausal channel coefficients
across all paths and $N_c$ as the maximum number of causal channel coefficients across
all paths, the ith column of $\mathbf{G}_p[n]$ is the effective channel of the ith path at time $n$,

\[ \mathbf{g}_i[n] = [\, g_i[n, N_c - 1] \; \cdots \; g_i[n, 0] \; \cdots \; g_i[n, -N_a] \,]^{T}, \tag{4.4.4} \]

where $g_i[m, \ell]$ is the channel impulse response coefficient for the ith path at time $m$
and delay $\ell$.

The transmitted symbol vector, $\mathbf{d}'[n]$ (defined similarly to eq. (2.2.4)), is

\[ \mathbf{d}'[n] = [\, d[n - N_c + 1] \; \cdots \; d[n] \; \cdots \; d[n + N_a] \,]^{T}, \tag{4.4.5} \]

where $d[m]$ is the transmitted data symbol at time $m$. A common simplifying assumption
is to use spatially and temporally white observation noise $\mathbf{v}[n]$. This assumption
is most accurate when the SNR is very high and instrumentation noise dominates
the environmental ambient noise. Stojanovic et al. [109] showed that a multichannel
DFE with a beamformer which uses $\mathbf{\Phi}$ as the beamforming weight matrix operates
with minimum mean squared error, i.e.

\[ \mathbf{u}_{bf}[n] = \mathbf{\Phi}^{H}\mathbf{u}[n], \tag{4.4.6} \]

where $\mathbf{u}_{bf}[n]$ is the beamformed data. Furthermore, when the signal is narrowband
and the noise is spatially and temporally white, any beamforming matrix $\mathbf{B}$ that
satisfies

\[ \mathbf{B}(\mathbf{B}^{H}\mathbf{B})^{-1}\mathbf{B}^{H}\mathbf{\Phi} = \mathbf{\Phi} \tag{4.4.7} \]

can be used as the beamformer weight matrix to achieve minimum mean squared
error performance. The matrix $\mathbf{B}(\mathbf{B}^{H}\mathbf{B})^{-1}\mathbf{B}^{H}$ is a projection matrix, so
eq. (4.4.7) implies that using any matrix whose columns span a subspace containing the
columns of $\mathbf{\Phi}$ does not increase the mean squared error of the multichannel DFE.

One example of a matrix which satisfies the condition of eq. (4.4.7) is the identity
matrix, $\mathbf{I}$, where the beamspace is the sensor-space. Reducing the number of beams,
however, reduces the computational complexity and also improves system performance
by reducing the number of adapting parameters. This reasoning strongly
suggests that the number of beams should be minimized. The minimum number of
beams implied by eq. (4.4.7) is the number of arrival paths.
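The subspace condition of eq. (4.4.7), namely that projecting onto the column space of the beamforming matrix leaves the array manifold matrix unchanged, is easy to check numerically. The Python sketch below, assuming NumPy, builds a manifold matrix for a few hypothetical arrival directions and tests candidate beamforming matrices. The function names and the default phase step `alpha = pi` (half-wavelength sensor spacing) are assumptions made for the example.

```python
import numpy as np

def steering_matrix(u, K, alpha=np.pi):
    """K x P matrix whose columns are manifold vectors for directions u = cos(theta).

    alpha is the phase step per sensor per unit u, i.e. 2*pi*f*d_s/c_w.
    """
    k = np.arange(K)[:, None]
    return np.exp(-1j * alpha * k * np.asarray(u, dtype=float)[None, :])

def preserves_signal_subspace(B, Phi, tol=1e-9):
    """True if the projection onto the columns of B leaves Phi unchanged."""
    gram = B.conj().T @ B
    projection = B @ np.linalg.solve(gram, B.conj().T)   # B (B^H B)^{-1} B^H
    return np.allclose(projection @ Phi, Phi, atol=tol)
```

The identity matrix, the manifold matrix itself, and any invertible mixing of its columns all satisfy the condition, while a matrix that omits one of the arrival directions does not.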

4.4.2  Derivation  of  optimal  beams 

In most underwater settings the arrival angles for the different multipath components
are unknown, so the matrix $\mathbf{\Phi}$ is also unknown. Adaptively tracking beam weights
which minimize the mean squared error is one way to circumvent this
issue [109]. Unfortunately, this adds computational complexity, and the
adaptation method described in [109] is observed to be unstable under certain
conditions (shown experimentally in Section 4.7).

An alternate method proposed in the current work is to create a non-adaptive
set of beams based on observed environmental parameters. In the shallow water
(Pekeris) waveguide there are only a finite number of propagating paths under our
model assumptions, so there are a finite number of arrivals [42]. The arrival angles
for the multipath components are bounded to $u_{min} < u < u_{max}$, where $u = \cos(\theta)$.
Without loss of generality the angle of arrival range is assumed to be centered, so
$-u_{min} = u_{max} = u_0$. Any non-centered range can be centered by introducing a phase
shift in the beam weights.

The condition in eq. (4.4.7) cannot be met with equality for a continuous range
of angles. The problem is equivalent to a discrete filter design problem with a unity
constraint over an angular range. Neither of these problems can be solved exactly in a
finite dimensional space. Each column of $\mathbf{\Phi}$ is an array manifold vector parametrized
by an angle of arrival, $\theta$. A new metric is proposed which minimizes the average
distortion of the inner product of two array manifold vectors over the angular range
of interest. The beamformer weight matrix, $\mathbf{B}_{opt}$, is found as a solution to the
optimization problem,

\[ \mathbf{B}_{opt} = \arg\min_{\mathbf{B}} \int_{-u_0}^{u_0}\!\int_{-u_0}^{u_0} \left| \mathbf{v}^{H}(u_1)\mathbf{v}(u_2) - \mathbf{v}^{H}(u_1)\mathbf{B}(\mathbf{B}^{H}\mathbf{B})^{-1}\mathbf{B}^{H}\mathbf{v}(u_2) \right|^{2} du_1\, du_2. \tag{4.4.8} \]


The form of this problem is similar to one recently studied by Kitchens in Chapter 5 of
his PhD thesis [48]. The solution Kitchens proposed was to build a matrix of the
desired rank whose columns are the eigenvectors corresponding to the largest
eigenvalues of the matrix

\[ \mathbf{R}_v = \int_{-u_0}^{u_0} \mathbf{v}(u)\mathbf{v}^{H}(u)\, du. \tag{4.4.9} \]


A key part of the proof in [48] uses the Poincare separation theorem to show that
the eigenvectors corresponding to the maximum eigenvalues are the solution to this
problem. The details of the proof imply that a whole family of metrics give rise to this
same solution: any metric that preserves the eigenvalue ordering is equivalent. One
result is that finding the subspace which minimizes the distortion of the inner product
is equivalent to finding the subspace that minimizes the average squared difference
between the array manifold vectors and their projection. Other cost functions also
produce the same solutions, and hence different interpretations, but these are not
explored further in this thesis.


Using the array manifold vectors for a linear array (see eq. (4.4.2)), the form of this
solution is the same (to within a scaling) as that of a problem studied by Slepian in
[94]. In that work he showed that the eigenvectors of the matrix in eq. (4.4.9) are the
set of discrete prolate spheroidal sequences (DPSS). Thus, a solution to eq. (4.4.8) is a
matrix where the columns are the first $P$ discrete prolate spheroidal sequences, where
$P$ is the number of paths.


4.4.3  Discrete  prolate  spheroidal  sequences 

The discrete prolate spheroidal sequences were discovered by Slepian in the 1970s [94]
while trying to answer the question: “What is the fixed-length, real sequence with the
most energy within a specified bandwidth?” The DPSS are real and have a number
of surprising and useful symmetry properties [94]. Since their discovery, DPSS have
been used in many areas of signal processing, most notably for multi-taper spectral
estimation [73, 114].

The set of DPSS of a given length, $N_{DPSS}$, and target normalized bandwidth,
$W$, forms a complete orthogonal basis. Note that only about $2 N_{DPSS} W$ sequences
will have a majority of their energy within the specified bandwidth [94].
Figure 4.4.1 shows an example of the first 5 DPSS weights for $N_{DPSS} = 24$ and
$W = 0.12$. Note that the energy of the fifth beam is starting to leak outside of the
specified bandwidth.

One reason DPSS are sometimes avoided is that there is no closed-form solution for
finding the DPSS values. Fortunately, efficient methods for finding the sequence values
are quite prevalent in the literature, e.g. [8, 73, 94]. In the current work, the DPSS beam
weights are found for a specified bandwidth, which corresponds to the angle of arrival
range. The number of DPSS beams can range from one to the number of sensors.
The DPSS beam weights are orthogonal since the set of DPSS are orthogonal by
definition. If the number of beams desired is equal to the number of sensors, the
beamformer weight matrix is unitary.
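One direct, if not the most efficient, way to obtain DPSS beam weights is to eigendecompose the band-limiting kernel. For the narrowband line-array manifold vectors, the $(m, n)$ entry of the matrix in eq. (4.4.9) is proportional to $\sin(2\pi W (m-n)) / (\pi (m-n))$ with normalized half-bandwidth $W = f d_s u_0 / c_w$. The Python sketch below, assuming NumPy, returns the dominant eigenvectors of that kernel. The function name is illustrative, and a production implementation would more likely use one of the faster tridiagonal methods from the literature.

```python
import numpy as np

def dpss_beams(K, W, P):
    """Return the P dominant eigenvectors of the K x K band-limiting kernel.

    K : number of sensors (sequence length)
    W : normalized half-bandwidth, 0 < W < 0.5
    P : number of beams to keep
    """
    m = np.arange(K)
    diff = m[:, None] - m[None, :]
    # sin(2*pi*W*d) / (pi*d), with the d = 0 entries equal to 2W
    kernel = 2.0 * W * np.sinc(2.0 * W * diff)
    eigval, eigvec = np.linalg.eigh(kernel)      # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:P]         # keep the P largest
    return eigvec[:, order], eigval[order]
```

Each eigenvalue is the fraction of that sequence's energy inside the band, and the eigenvalues sum to $2 K W$. Calling `dpss_beams(24, 0.12, 5)` corresponds to the length and bandwidth quoted for Figure 4.4.1.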

4.5  Alternative  beamforming  strategies 

In the previous section the optimal beam-weights were found to be the DPSS when
the angles of arrival are unknown but bounded to a specified range. The algorithm
for determining the DPSS can have high computational complexity, the beam-weights
are symmetric, and they are only optimal for the specified criterion. In this section,
a variety of alternative beamforming strategies are presented for comparison with the
DPSS beam-weights.


Figure 4.4.1: First 5 DPSS coefficients for the DPSS with 24 coefficients constrained
within a normalized bandwidth bounded by ±0.12. (Plot labels: DPSS Coefficients;
Normalized Frequency.)

Figure 4.5.1: Uniformly weighted beampattern in k space at the design frequency of
the array. Note that the peak of each beam is at the null of the adjacent beams.


The methods for determining the beam-weights in this section are chosen because
of their relative simplicity (steered uniform beams), because they are well established
in the literature (MVDR, MCM, principal-component, and fully adaptive methods),
or because they exploit some property that potentially improves performance while
reducing computational complexity (time-aligned beams and hybrid methods).


4.5.1  Uniform  beamformer 

One alternative set of beam weights is uniformly weighted beams. As the name
implies, the beam weight coefficients all have the same magnitude. The beam weights
are the array manifold vectors with angles specified so that neighboring beams are
orthogonal at the design frequency, usually specified as $f_d = \frac{c_w}{2 d_s}$ for an array with
sensor spacing of $d_s$. The first beam is placed at broadside, which ensures
that the beams are placed symmetrically. Figure 4.5.1 shows an example of adjacent
beams at $f_d$.
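A minimal Python sketch of this beam placement, assuming NumPy, is given below. It assumes operation at the design frequency (half-wavelength sensor spacing), in which case steering directions separated by $2/K$ in $u = \cos(\theta)$ give mutually orthogonal manifold vectors. The function name and the alternating placement order around broadside are illustrative choices.

```python
import numpy as np

def uniform_beams(K, P):
    """Steering directions and weights for P mutually orthogonal uniform beams.

    Assumes half-wavelength sensor spacing, i.e. operation at the design
    frequency, so the phase step between sensors is pi * u with u = cos(theta).
    """
    offsets = [0]                      # first beam at broadside (u = 0)
    step = 1
    while len(offsets) < P:            # then alternate to either side
        offsets.append(step)
        if len(offsets) < P:
            offsets.append(-step)
        step += 1
    u = 2.0 * np.array(offsets) / K    # spacing of 2/K puts each peak on its neighbours' nulls
    k = np.arange(K)[:, None]
    weights = np.exp(-1j * np.pi * k * u[None, :]) / np.sqrt(K)
    return u, weights
```

With an odd number of beams the set is symmetric about broadside; with an even number, one extra beam falls on the positive side.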
Figure 4.5.2: Arrivals from SPACE08 data during calm weather conditions. The
orange line specifies the delay calculated for the time-aligned uniform beam weights.
Panel (a) shows the bounds for $\alpha_b = 1$ and panel (b) for $\alpha_b = 1/2$.

4.5.2  Uniform  beamformer  time-aligned 

The  geometric  ray-path  model  devised  in  Section  4.3  indicates  that  the  arrival  delay 
depends  on  the  angle  of  arrival.  The  method  for  finding  the  uniform  beams  described 
above  produces  a  specific  set  of  angles  where  the  beams  are  steered.  Using  the 
expressions  from  Section  4.3,  the  path  delays  can  be  computed  for  the  specified  beam 
angles. 

Recall that for white observation noise, the feedforward section of the DFE is
a matched filter. Using the estimated path delay for the beam angle, only the
feedforward coefficients corresponding to delays after the arrival are included in the
feedforward section; the feedforward coefficients corresponding to regions where no
signal energy is expected are discarded. This reduces the dimensionality of the
adaptive equalizer, which reduces the computational complexity.

Using the geometric ray-path model, the estimated arrival delay, $\tau_m$, for a specified
arrival angle, $\theta$, is

\[ \tau_m(\theta) = \alpha_b \frac{\ell_{TR}}{c_w} \csc^{2}(\theta) - \tau_{offset}. \tag{4.5.1} \]


The parameters $\alpha_b$ and $\tau_{offset}$ are design parameters introduced to account for
inaccuracies of the ray-path model and to decrease the sensitivity of the beams to
channel motion. The expression $\csc(\theta)$ is the cosecant of $\theta$ (i.e., the reciprocal of
the sine of $\theta$). The parameter $\alpha_b$ is a scaling factor used to widen the bound and
the parameter $\tau_{offset}$ is a delay offset. Here these parameters are set to $\alpha_b = 1/2$
and $\tau_{offset} = 10 N_s / f_s$, where $N_s$ is the number of samples per symbol and $f_s$ is the
sampling frequency. Figure 4.5.2 illustrates two values of the scaling parameter,
$\alpha_b = 1$ and $\alpha_b = 1/2$.


4.5.3 Principal component (eigenvector) beamforming


A desirable property of the beamformed received signal is that the beams are mutually uncorrelated. LeBlanc and Beaujean [55, 56] proposed achieving this using principal component analysis (PCA) on the estimated received signal correlation matrix, $\hat{\mathbf{R}}_u$. To see that this is the correct method, consider the objective:

$$E\{u_{\mathrm{bf},i}[n]\, u_{\mathrm{bf},j}^{*}[n]\} = \alpha_{\mathrm{PCA}}\, \delta_{i,j}. \tag{4.5.2}$$

In this expression $u_{\mathrm{bf},i}$ is the received signal beamformed with beam $i$, $\alpha_{\mathrm{PCA}}$ is a scaling, and $\delta_{i,j}$ is the Kronecker delta

$$\delta_{i,j} = \begin{cases} 1 & i = j \\ 0 & i \neq j. \end{cases}$$

Using the expression $u_{\mathrm{bf},i}[n] = \mathbf{w}_{\mathrm{PCA},i}^{H} \mathbf{u}[n]$ and evaluating the expectation in eq. (4.5.2) gives

$$\mathbf{w}_{\mathrm{PCA},i}^{H} \mathbf{R}_u \mathbf{w}_{\mathrm{PCA},j} = \alpha_{\mathrm{PCA}}\, \delta_{i,j}. \tag{4.5.3}$$

One solution for $\mathbf{w}_{\mathrm{PCA},i}$ is the eigenvectors of $\mathbf{R}_u$. The received data correlation matrix is not available and so it is estimated using

$$\hat{\mathbf{R}}_u[n] = \sum_{j=n-N+1}^{n} \mathbf{u}[j]\, \mathbf{u}^{H}[j]. \tag{4.5.4}$$

The scaling parameter $\alpha_{\mathrm{PCA}}$ becomes the eigenvalue associated with the $i$th eigenvector.

LeBlanc and Beaujean focused mainly on decorrelating the data. This method not only decorrelates the beamformed data but, by using the eigenvectors with the largest eigenvalues first, also captures the most signal energy for a specified number of beams.

In this thesis the eigenvectors corresponding to a specified number of the largest principal components (eigenvalues) of the estimated received signal correlation matrix are used as beamforming weights.
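
A minimal numerical sketch of this procedure is given below. It uses synthetic Gaussian data rather than experimental data, and the function name `pca_beams`, the array sizes, and the random mixing matrix are assumptions made only for illustration.

```python
import numpy as np

def pca_beams(snapshots, num_beams):
    """snapshots: K x N array, one column per time sample.

    Returns the num_beams largest eigenvalues and the matching eigenvectors
    (as columns) of the unnormalised sample correlation matrix.
    """
    R_hat = snapshots @ snapshots.conj().T           # sum of u[j] u[j]^H
    eigvals, eigvecs = np.linalg.eigh(R_hat)         # Hermitian solver, ascending order
    order = np.argsort(eigvals)[::-1][:num_beams]    # indices of the largest eigenvalues
    return eigvals[order], eigvecs[:, order]

rng = np.random.default_rng(0)
K, N = 6, 400
# Synthetic correlated sensor data (stand-in for received array snapshots).
mixing = rng.standard_normal((K, K)) + 1j * rng.standard_normal((K, K))
u = mixing @ (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N)))

lam, W = pca_beams(u, num_beams=3)
u_bf = W.conj().T @ u                 # beamformed data, one row per beam
R_bf = u_bf @ u_bf.conj().T           # sample correlation between the beams
```

Over the same window used to form the estimate, `R_bf` is diagonal with the retained eigenvalues on its diagonal, which is the sample version of eq. (4.5.3).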


4.5.4  MVDR  and  MCM  beamforming 

The techniques discussed up to this point have not taken the spatial spectrum of the noise into account. For many underwater channels, the noise wavenumber spectrum is colored due to the nature of underwater noise [119, 42]. When the noise correlation matrix is known or well-estimated, the minimum variance distortionless response (MVDR) beamformer, which is similar to the Capon beamformer [13], is a common structure used in the beamforming community. This beamformer provides the minimum variance response such that signals arriving from the specified angle of arrival are not distorted. The constrained optimization problem for finding the MVDR beam weights is

$$\mathbf{w}_{\mathrm{MVDR}} = \arg\min_{\mathbf{w}} E\{|\mathbf{w}^{H}\mathbf{s}|^{2}\} \quad \text{subject to} \quad \mathbf{w}^{H}\mathbf{v}(u_1) = 1. \tag{4.5.5}$$

In this formulation, $\mathbf{s} = \mathbf{v}(u_1) + \boldsymbol{\nu}$ is the signal vector, $\mathbf{w}_{\mathrm{MVDR}}$ is the beamformer weight vector, $\mathbf{v}(u_1)$ is the array manifold vector pointing at direction $u_1 = \cos\theta_1$ ($\theta_1$ is the AoA), $\boldsymbol{\nu}$ is a noise vector, and $\mathbf{w}^{H}\mathbf{v}(u_1) = 1$ is the distortionless criterion. The solution to this optimization problem is [121]

$$\mathbf{w}_{\mathrm{MVDR}} = \left[\mathbf{v}^{H}(u_1)\mathbf{R}_{\nu}^{-1}\mathbf{v}(u_1)\right]^{-1}\mathbf{R}_{\nu}^{-1}\mathbf{v}(u_1), \tag{4.5.6}$$

where $\mathbf{R}_{\nu}$ is the noise correlation matrix.


One shortcoming of the MVDR framework is that it is sensitive to model mismatch. For example, when the signal is treated as noise, it is rejected when the true signal is

$$\mathbf{s} = \mathbf{v}(\tilde{u}_1) + \boldsymbol{\nu},$$

and  the  difference  between  the  two  angles  is  A u\  =  \u\  —  U\\  >  (assuming  white 
noise).  This  difficulty  can  be  handled  when  using  multiple  MVDR  beams  by  ensuring 
the  difference  is  not  too  big  between  two  neighboring  specified  distortionless  response 
angles,  i.e.  for  two  MVDR  beams  with  specified  angles  U\  and  u2, 


| U2  -  Ml|  < 


Kdsf 


Another difficulty of using MVDR beams occurs when the noise is highly directional; regardless of SNR, signals arriving near strong noise directions are highly attenuated. The purpose of a beamformer in a communication system is to collect as much signal energy as possible to increase the observed SNR. Creating a beamformer which potentially rejects signal energy is non-ideal.

One method to partially mitigate this shortcoming is to use the multiple constraint method (MCM) proposed by Schmidt et al. [89]. The MCM imposes additional constraints to ensure that directions near the distortionless direction are not attenuated too heavily. In the present case, the additional constraints ensure that notches do not appear in the main lobe of the beamformer.

The MVDR optimization problem from eq. (4.5.5) with additional constraints is the MCM constrained optimization problem

$$\begin{aligned} \mathbf{w}_{\mathrm{MCM}} = \arg\min_{\mathbf{w}}\; & E\{|\mathbf{w}^{H}\mathbf{s}|^{2}\} \\ \text{subject to}\; & \mathbf{w}_{\mathrm{MCM}}^{H}\mathbf{v}(u_1) = 1 \\ & \mathbf{w}_{\mathrm{MCM}}^{H}\mathbf{v}(u_2) = b_2 \\ & \qquad\vdots \\ & \mathbf{w}_{\mathrm{MCM}}^{H}\mathbf{v}(u_{N_{mc}}) = b_{N_{mc}}. \end{aligned} \tag{4.5.7}$$

In the above expressions, $b_i$ is a constraint value and $N_{mc}$ is the number of constraints. The first constraint is the distortionless constraint in the desired look direction, $u_1$, so $b_1 = 1$. One method for setting the additional constraints, borrowed from [89], is to set $b_i = \mathbf{v}^{H}(u_1)\mathbf{v}(u_i)$, the inner product of the distortionless direction with the constraint direction. Setting all $b_i = 1$ uses too many degrees of freedom and limits the noise rejection capability outside of the constraint directions. The constraint directions are usually chosen to be within the main-lobe response of the distortionless direction.


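Building a vector $\mathbf{b}$ of constraint values and a matrix $\mathbf{V}$, where each column is an array manifold vector pointed in a constraint direction, i.e.

$$\mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_{N_{mc}} \end{bmatrix}, \qquad \mathbf{V} = \begin{bmatrix} \mathbf{v}(u_1) & \mathbf{v}(u_2) & \cdots & \mathbf{v}(u_{N_{mc}}) \end{bmatrix},$$

the constraints can be written more compactly as

$$\mathbf{w}_{\mathrm{MCM}}^{H}\mathbf{V} = \mathbf{b}^{H}. \tag{4.5.8}$$

The solution for the beamformer weights using MCM is [89]

$$\mathbf{w}_{\mathrm{MCM}} = \mathbf{R}_{\nu}^{-1}\mathbf{V}\left[\mathbf{V}^{H}\mathbf{R}_{\nu}^{-1}\mathbf{V}\right]^{-1}\mathbf{b}. \tag{4.5.9}$$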




To create a set of beams, a set of directions is chosen (e.g. evenly spaced angles over the range of interest) and MCM beam weights are calculated for each beam direction. The distortionless directions chosen for the set of MCM beams (one distortionless direction per beam) are the same as the steered directions of the uniform beams described in the previous section. The reason for this choice is that when the noise is white, the MCM beamformer weight matrix equals the uniform beamformer weight matrix.
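
The MVDR and MCM weight computations in eqs. (4.5.6) and (4.5.9) can be sketched as follows. Several details are assumptions made for illustration and are not taken from the thesis: a half-wavelength-spaced line array with the phase reference at its centre (which makes the constraint values real), constraint values normalised by the number of sensors so that $b_1 = 1$, and a synthetic noise field consisting of white noise plus one strong directional interferer. The function names are hypothetical.

```python
import numpy as np

def manifold(u, num_sensors):
    """Array manifold vector for direction u = cos(theta), phase-centred array."""
    k = np.arange(num_sensors) - (num_sensors - 1) / 2.0
    return np.exp(1j * np.pi * k * u)

def mvdr_weights(R_noise, v):
    """w = R^-1 v / (v^H R^-1 v), as in eq. (4.5.6)."""
    Rinv_v = np.linalg.solve(R_noise, v)
    return Rinv_v / (v.conj() @ Rinv_v)

def mcm_weights(R_noise, V, b):
    """w = R^-1 V (V^H R^-1 V)^-1 b, as in eq. (4.5.9)."""
    Rinv_V = np.linalg.solve(R_noise, V)
    return Rinv_V @ np.linalg.solve(V.conj().T @ Rinv_V, b)

K = 8
directions = [0.0, 0.08, -0.08]            # look direction first, then extra constraints
V = np.column_stack([manifold(u, K) for u in directions])
b = np.real(V.conj().T @ V[:, 0]) / K      # normalised inner products, so b[0] = 1

# White noise: the MCM beam reduces to the uniformly weighted beam.
w_white = mcm_weights(np.eye(K), V, b)

# Coloured noise: white noise plus a strong interferer at u = 0.5 (synthetic).
interferer = manifold(0.5, K)
R_coloured = np.eye(K) + 10.0 * np.outer(interferer, interferer.conj())
w_mvdr = mvdr_weights(R_coloured, V[:, 0])
w_mcm = mcm_weights(R_coloured, V, b)
```

With white noise the result equals $\mathbf{v}(u_1)/K$, consistent with the statement above that the MCM weights reduce to the uniform weights, while in the coloured case both beamformers still satisfy their constraints exactly.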

This procedure can be extended by enforcing mutual orthogonality among the beams. The multichannel DFE with orthogonal MCM beams, however, had higher observed output mean squared error than a multichannel DFE with non-orthogonal MCM beams. This result appeared in every data set tested, so the additional effort of constraining the beams to be orthogonal does not appear to be worthwhile.

Both the MVDR and MCM beamforming weights are non-ideal for communication systems because the goal of both methods is to reject energy from certain directions. The main purpose of the beamformer in a communication system is to gather as much signal energy as possible and to preserve as many degrees of freedom as possible for future adaptation by the DFE. Thus, a multichannel DFE using either MVDR or MCM beam weights will not perform as well as the other proposed methods.

4.5.5  Adaptive  time-domain  beamforming 

One  method  to  avoid  the  complications  of  the  narrowband  assumption  is  to  work 
entirely  in  the  time  domain.  Initial  work  on  an  adaptive  array  with  tapped  delay  line 
processing  was  done  by  Compton  et  al.  [22,  23,  84,  122,  123].  Recently,  Stojanovic 
et  al.  [109]  proposed  a  time-domain,  adaptive  beamformer  with  a  multichannel  DFE. 
The  beamformer  and  the  DFE  are  adapted  together  using  the  same  error  to  update 
their  coefficients  using  an  RLS  algorithm. 

The procedure for finding the multichannel DFE coefficients when using a beamformer is very similar to the procedure for finding the multichannel DFE coefficients without a beamformer. The beamformed and equalized data before a symbol decision is made are

$$\hat{d}[n] = \begin{bmatrix} \mathbf{h}_{\mathrm{ff},1}[n-1] \\ \mathbf{h}_{\mathrm{ff},2}[n-1] \\ \vdots \\ \mathbf{h}_{\mathrm{ff},M}[n-1] \end{bmatrix}^{H} \begin{bmatrix} \mathbf{U}^{T}[n]\,\mathbf{c}_1^{*}[n-1] \\ \mathbf{U}^{T}[n]\,\mathbf{c}_2^{*}[n-1] \\ \vdots \\ \mathbf{U}^{T}[n]\,\mathbf{c}_M^{*}[n-1] \end{bmatrix} + \mathbf{h}_{\mathrm{fb}}^{H}[n-1]\,\mathbf{d}_{\mathrm{fb}}[n-1]. \tag{4.5.10}$$

In this expression, $\mathbf{U}$ is an $L_{\mathrm{ff}} \times K$ matrix of the received signal where each column corresponds to one sensor ($L_{\mathrm{ff}} = L_a + L_c$ is the number of feedforward coefficients in each feedforward section of the DFE), $\mathbf{c}_i$ is a vector of the $i$th beam weights, $\mathbf{h}_{\mathrm{ff},i}$ is a vector of the $i$th feedforward section coefficients, $\mathbf{h}_{\mathrm{fb}}$ is the vector of feedback coefficients, and $\mathbf{d}_{\mathrm{fb}}$ is a vector of past data estimates. The time index on all estimated vectors indicates the time at which the estimate was made (all are lag-1 estimates). The time index on the feedback data indicates that the most recent estimate is the last piece of data.


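A more compact representation is

$$\hat{d}[n] = \mathbf{h}_{\mathrm{ff}}^{H}[n-1]\,\mathbf{q}_c[n] + \mathbf{h}_{\mathrm{fb}}^{H}[n-1]\,\mathbf{d}_{\mathrm{fb}}[n-1], \tag{4.5.11}$$

where $\mathbf{h}_{\mathrm{ff}}$ is a vector of all the feedforward DFE coefficients and

$$\mathbf{q}_c[n] = \begin{bmatrix} \mathbf{U}^{T}[n]\,\mathbf{c}_1^{*}[n-1] \\ \mathbf{U}^{T}[n]\,\mathbf{c}_2^{*}[n-1] \\ \vdots \\ \mathbf{U}^{T}[n]\,\mathbf{c}_M^{*}[n-1] \end{bmatrix}.$$

An even more compact representation is given by

$$\hat{d}[n] = \mathbf{h}^{H}[n]\,\mathbf{x}[n], \tag{4.5.12}$$

where

$$\mathbf{h}[n] = \begin{bmatrix} \mathbf{h}_{\mathrm{ff}}[n] \\ \mathbf{h}_{\mathrm{fb}}[n] \end{bmatrix} \tag{4.5.13}$$

and

$$\mathbf{x}[n] = \begin{bmatrix} \mathbf{q}_c[n] \\ \mathbf{d}_{\mathrm{fb}}[n-1] \end{bmatrix}. \tag{4.5.14}$$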


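The equalizer coefficients are found as the solution to the exponentially windowed least squares optimization problem,

$$\mathbf{h}[n] = \arg\min_{\mathbf{h}'} \sum_{i=0}^{n} \lambda^{n-i} \left| d[i] - \mathbf{h}'^{H}\mathbf{x}[i] \right|^{2}, \tag{4.5.15}$$

where $d[i]$ is the transmitted data symbol estimate at time $i$ and $\lambda$ is the exponential weighting coefficient. The solution to this optimization problem is

$$\mathbf{h}[n] = \left[ \sum_{i=0}^{n} \lambda^{n-i}\, \mathbf{x}[i]\,\mathbf{x}^{H}[i] \right]^{-1} \left[ \sum_{i=0}^{n} \lambda^{n-i}\, \mathbf{x}[i]\, d^{*}[i] \right]. \tag{4.5.16}$$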
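
The closed-form exponentially windowed least squares solution can be checked numerically with a batch computation. The sketch below is not the recursive (RLS) implementation used in practice; it evaluates the two weighted sums directly on synthetic, noise-free data, and the function name and data sizes are assumptions made for illustration.

```python
import numpy as np

def ewls_solution(X, d, lam):
    """Batch exponentially weighted least squares.

    X   : n x L array whose i-th row is x[i]^T
    d   : length-n array of reference symbols d[i]
    lam : exponential weighting coefficient, 0 < lam <= 1
    Returns h minimising sum_i lam^(n-i) |d[i] - h^H x[i]|^2.
    """
    n = X.shape[0]
    w = lam ** np.arange(n - 1, -1, -1)      # newest sample has weight 1
    R = (X.T * w) @ X.conj()                 # sum_i w_i x[i] x[i]^H
    p = (X.T * w) @ d.conj()                 # sum_i w_i x[i] d[i]^*
    return np.linalg.solve(R, p)

rng = np.random.default_rng(0)
h_true = rng.standard_normal(4) + 1j * rng.standard_normal(4)
X = rng.standard_normal((50, 4)) + 1j * rng.standard_normal((50, 4))
d = X @ h_true.conj()                        # d[i] = h_true^H x[i], noise free
h_hat = ewls_solution(X, d, lam=0.98)
```

With noise-free data the recovered coefficients match the generating coefficients, which confirms that the two bracketed sums are paired correctly.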


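A key observation for finding the beamforming coefficients is that in eq. (4.5.10), the beamformer and the feedforward filter are interchangeable. Thus, the data estimate can be rewritten as

$$\hat{d}[n] = \begin{bmatrix} \mathbf{c}_1[n-1] \\ \mathbf{c}_2[n-1] \\ \vdots \\ \mathbf{c}_M[n-1] \end{bmatrix}^{H} \begin{bmatrix} \mathbf{U}[n]\,\mathbf{h}_{\mathrm{ff},1}^{*}[n-1] \\ \mathbf{U}[n]\,\mathbf{h}_{\mathrm{ff},2}^{*}[n-1] \\ \vdots \\ \mathbf{U}[n]\,\mathbf{h}_{\mathrm{ff},M}^{*}[n-1] \end{bmatrix} + \mathbf{h}_{\mathrm{fb}}^{H}[n-1]\,\mathbf{d}_{\mathrm{fb}}[n-1]. \tag{4.5.17}$$

Eq. (4.5.17) can be rewritten as

$$\hat{d}[n] = \mathbf{c}^{H}[n-1]\,\mathbf{q}_h[n] + \mathbf{h}_{\mathrm{fb}}^{H}[n-1]\,\mathbf{d}_{\mathrm{fb}}[n-1], \tag{4.5.18}$$

where $\mathbf{c}[n]$ is a vector of the adaptive beam weights stacked together and

$$\mathbf{q}_h[n] = \begin{bmatrix} \mathbf{U}[n]\,\mathbf{h}_{\mathrm{ff},1}^{*}[n-1] \\ \mathbf{U}[n]\,\mathbf{h}_{\mathrm{ff},2}^{*}[n-1] \\ \vdots \\ \mathbf{U}[n]\,\mathbf{h}_{\mathrm{ff},M}^{*}[n-1] \end{bmatrix}.$$

The beamforming coefficients are found by solving the exponentially windowed least-squares problem

$$\mathbf{c}[n] = \arg\min_{\mathbf{c}'} \sum_{i=0}^{n} \lambda^{n-i} \left| d[i] - \mathbf{h}_{\mathrm{fb}}^{H}[i]\,\mathbf{d}_{\mathrm{fb}}[i-1] - \mathbf{c}'^{H}\mathbf{q}_h[i] \right|^{2}. \tag{4.5.19}$$

The solution to this optimization problem is

$$\mathbf{c}[n] = \left[ \sum_{i=0}^{n} \lambda^{n-i}\, \mathbf{q}_h[i]\,\mathbf{q}_h^{H}[i] \right]^{-1} \left[ \sum_{i=0}^{n} \lambda^{n-i}\, \mathbf{q}_h[i] \left( d[i] - \mathbf{h}_{\mathrm{fb}}^{H}[i]\,\mathbf{d}_{\mathrm{fb}}[i-1] \right)^{*} \right]. \tag{4.5.20}$$

Comparing eqs. (4.5.16) and (4.5.20) reveals that the error term being minimized is the same, which allows for a parallel implementation.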
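
The interchangeability of the beamformer and the feedforward filter is a consequence of the feedforward term being bilinear in the two sets of weights, and it can be verified numerically. In the sketch below the received-signal block is stored with one row per sensor (a $K \times L_{\mathrm{ff}}$ array); this orientation is an assumption chosen so that the products $\mathbf{U}^{T}\mathbf{c}_i^{*}$ and $\mathbf{U}\,\mathbf{h}_{\mathrm{ff},i}^{*}$ are conformable. All values are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)
K, L, M = 5, 4, 3    # sensors, taps per feedforward section, beams

def crandn(*shape):
    """Complex standard normal samples (placeholder data)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

U = crandn(K, L)     # received-signal block, one row per sensor (assumed layout)
c = crandn(M, K)     # row i holds the beam weights c_i
h = crandn(M, L)     # row i holds the feedforward coefficients h_ff,i

# Beamform first, then equalise: stacked vector q_c.
q_c = np.concatenate([U.T @ c[i].conj() for i in range(M)])
# Equalise first, then beamform: stacked vector q_h.
q_h = np.concatenate([U @ h[i].conj() for i in range(M)])

ff_equalizer_view = np.vdot(h.reshape(-1), q_c)     # h_ff^H q_c
ff_beamformer_view = np.vdot(c.reshape(-1), q_h)    # c^H q_h
```

Both expressions give the same feedforward contribution to the data estimate, so the same error signal can drive the update of either the equalizer coefficients (with the beams held fixed) or the beam weights (with the equalizer held fixed).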

The adaptive beamforming algorithm tends to work well in practice, but the algorithmic stability is hard to analyze due to nonlinearities. Instabilities have been observed in implementation, even when used in training mode where actual transmitted symbols are used instead of symbol decisions.

The use of the same error metric for adaptation of both the beams and the equalizer coefficients could lead to a variety of failure modes. If one of the beams has a very low weight or two of the beams are highly correlated, the inverse matrix is ill-conditioned. The reverse situation also occurs when one feedforward section of the DFE has all low weights. These situations occur when the problem is over-parametrized, i.e. more beams are used than paths and there are a limited number of snapshots. Figure 4.5.3 shows the interconnectedness of the adaptive algorithm. Since there is so much interconnection and the use of shared quantities is non-linear, instabilities could easily occur when using the algorithm.


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Figure  4.5.3:  Interconnections  in  adaptive  beamformer-DFE  algorithm,  which  could 
lead  to  possible  instabilities. 

4.5.6  Hybrid  methods 

To harness both the computational savings of the non-adaptive beams and the performance of the adaptive methods, a hybrid method is proposed. In this method the received signal is sent through an initial beamformer which reduces the dimensionality of the data from sensor space to a beamspace. This beamformed data is fed into an adaptive beamformer algorithm and then into the multichannel DFE.

The main idea behind this method is that using a non-adaptive beamforming method, such as DPSS, preserves more of the signal energy and has similar computational advantages to reducing dimensionality by ignoring some of the receive sensors. Additionally, there are fewer parameters for the adaptive beamformer to adjust (due to the dimension reduction), which could improve performance in highly time-varying environments. A block diagram of this method is shown in figure 4.5.4. The beamformed data after the initial beamformer, which has $B > P$ beams, is represented as $\mathbf{u}'[n]$.


4.6  Estimating  the  number  of  beams 

In the previous sections, many methods for finding beampatterns for a given number of beams were presented. In this thesis there has been no strong guidance yet into how to choose the number of beams to use, except that the number of beams should be equal to the number of paths.

Figure 4.5.4: Hybrid method for combining a non-adaptive beamformer with an adaptive beamformer. First the data is beamformed using a non-adaptive beamformer, such as a set of DPSS beams or uniformly weighted beams, and then the signal is sent through an equalizer and beamformer which are allowed to adapt to the signal.

In this section, several methods are presented for estimating the number of arrivals impinging on the array. These methods fall into three broad classes: (i) physics-based methods, (ii) information theoretic methods, and (iii) generalized $\chi^2$ methods. Physics-based methods use the ray-path model presented in Section 4.3 and environmental parameters to estimate the number of arrivals. Information theoretic methods use eigenvalue analysis to create an estimate of the rank of the signal subspace. Generalized $\chi^2$ methods assume the channel is Rayleigh fading and match the estimated received signal statistics to a generalized $\chi^2$ distribution to determine the number of degrees of freedom and hence the number of arrival paths.

The aim of this section is to evaluate these methods and describe their relative strengths and weaknesses. The results show that generalized $\chi^2$ methods produce an estimate of the number of beams which nearly matches the knee in the observed multichannel DFE performance curves. Physics-based methods produce a slightly higher estimate, but one that is still reasonable. Information theoretic methods produce an estimate much higher than the observed data seems to support.
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4.6.1  Physics-based  method 


Section  4.3  described  a  geometric  ray-path  model  of  the  acoustic  propagation  envi¬ 
ronment.  Along  with  the  assumption  of  a  constant,  known  sound  speed  this  model 
provides  not  only  the  angle  bounds  discussed  earlier  but  also  the  number  of  propaga¬ 
tion  paths.  Figure  4.3.1  shows  an  illustration  of  the  propagation  paths. 

The  number  of  equalizer  taps  in  each  feedforward  section  is  assumed  to  either  be 
specified  or  determined  from  the  channel  impulse  response  estimate.  Generally,  the 
number  of  taps  is  chosen  such  that  there  are  enough  to  span  most  of  the  energetic 
portion  of  the  channel  impulse  response.  If  the  specific  channel  impulse  response  is 
not  available,  the  number  of  taps  is  usually  specified  for  a  range  of  expected  channels 
and  the  available  computing  resources. 

At first glance this may appear to be a chicken-and-egg problem because one might expect that the number of paths should be clear from examining the channel impulse response. The energetic region of a channel, however, is much easier to estimate than the number of arrival paths. Figure 4.6.1 shows the evolution in time of a channel impulse response estimate when the communication distance was 1 km. The delay spread with significant energy is approximately 10 ms. The number of significant multipath components is not obvious from the channel impulse response estimate.

The delay spread of the channel with significant energy could be found by inserting a sequence with good correlation properties (e.g. an M-sequence) into the communications packet and correlating the received signal with the same sequence. This algorithm could be efficiently implemented and the delay spread determined automatically. The author knows of no similarly simple technique to determine the number of significant arrivals.
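
A minimal sketch of such a correlation-based delay spread estimate follows. Everything specific in it is an assumption chosen for illustration: a length-127 M-sequence generated from the recurrence $a[n] = a[n-6] \oplus a[n-7]$, a repeated probe so that one received period is a circular convolution, a synthetic noise-free three-path channel, and a significance threshold of 10% of the correlation peak.

```python
import numpy as np

def m_sequence():
    """Length-127 +/-1 maximal-length sequence, a[n] = a[n-6] XOR a[n-7]."""
    bits = [1] * 7                              # any non-zero initial state
    for n in range(7, 127):
        bits.append(bits[n - 6] ^ bits[n - 7])
    return 1.0 - 2.0 * np.array(bits)           # map {0, 1} to {+1, -1}

probe = m_sequence()
period = probe.size

# Synthetic sparse channel: arrivals at 0, 12 and 31 samples (hypothetical).
channel = np.zeros(40)
channel[[0, 12, 31]] = [1.0, 0.6, 0.3]

# Send three periods and keep the middle one, which is in steady state.
rx = np.convolve(np.tile(probe, 3), channel)
rx_mid = rx[period:2 * period]

# Circular cross-correlation of the received period with the probe.
est = np.fft.ifft(np.fft.fft(rx_mid) * np.conj(np.fft.fft(probe))).real
significant = np.flatnonzero(np.abs(est) > 0.1 * np.abs(est).max())
delay_spread_samples = significant.max() - significant.min()

# Periodic autocorrelation of the probe, for reference.
acf = np.fft.ifft(np.abs(np.fft.fft(probe)) ** 2).real
```

Because the periodic autocorrelation of an M-sequence is a single peak with a flat floor, the correlator output is a scaled copy of the channel taps plus a small constant, and the span of the lags above threshold gives the delay spread. In a noisy, time-varying channel the threshold would have to be chosen with more care than in this sketch.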

Once the number of coefficients per feedforward section is known, the ray-path model is used to compute the number of arrival paths which fall within a specified delay extent (i.e. the number of feedforward equalizer coefficients). Recall that Table 4.3.1 shows the expressions for the first five arrivals.
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Figure  4.6.1:  Estimate  of  a  time-varying  channel  impulse  response.  The  data  is  from 
the  SPACE08  experiment,  with  a  1000  m  propagation  distance  from  transmitter  to 
receiver  and  a  RLS  channel  estimator  is  used.  The  color  scale  indicates  the  magnitude 
of  the  channel  estimate  in  time  and  delay.  This  figure  illustrates  that  the  channel 
delay  spread  is  simple  to  approximate,  but  the  number  of  multipath  arrivals  is  not 
apparent. 


4.6.2  Information  theoretic  methods 

Description  of  estimator 

Using information theory, a method can be derived for determining the number of arrivals directly from the data. The methods explored here assume that the plane-wave propagation model is valid (narrowband assumption). The key observation of these methods is that when the received signal correlation matrix, $\mathbf{R}_u = E\{\mathbf{u}[n]\mathbf{u}^{H}[n]\}$, and the observation noise correlation matrix, $\mathbf{R}_{\nu} = E\{\boldsymbol{\nu}[n]\boldsymbol{\nu}^{H}[n]\}$, are both known, the whitened correlation matrix, $\mathbf{R}_{\nu}^{-1}\mathbf{R}_u$, has $P$ eigenvalues greater than one and $K - P$ eigenvalues equal to one, where $P$ is the number of paths and $K$ is the number of receive sensors. An eigenvalue decomposition of the whitened correlation matrix exactly estimates the number of paths, which is equal to the appropriate number of beams [109].
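
The eigenvalue structure of the whitened correlation matrix can be illustrated with known (not estimated) matrices. In the sketch below the matrix standing in for the array manifold matrix, the path-space correlation matrix, and the coloured noise correlation matrix are all synthetic placeholders chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
K, P = 6, 2    # receive sensors, paths

# Synthetic stand-in for a full column rank array manifold matrix.
Phi = rng.standard_normal((K, P)) + 1j * rng.standard_normal((K, P))
R_y = np.diag([3.0, 1.5])                       # path-space signal correlation

# Synthetic coloured, positive definite noise correlation matrix.
A = rng.standard_normal((K, K)) + 1j * rng.standard_normal((K, K))
R_nu = A @ A.conj().T + K * np.eye(K)

R_u = Phi @ R_y @ Phi.conj().T + R_nu           # signal plus noise

# Eigenvalues of the whitened matrix R_nu^-1 R_u (real up to rounding error).
eig = np.sort(np.linalg.eigvals(np.linalg.solve(R_nu, R_u)).real)[::-1]
num_paths = int(np.sum(eig > 1.0 + 1e-8))
```

With exact correlation matrices the count of eigenvalues above one equals $P$ and the remaining $K - P$ eigenvalues equal one to within rounding error. The small tolerance used here only absorbs floating point error; it is not a statistical threshold of the kind needed when the matrices are estimated from data.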

Unfortunately,  neither  the  observation  noise  correlation  matrix  nor  the  received 
signal  correlation  matrix  is  known  and  both  must  be  estimated  from  the  data.  The 
estimated  received  signal  correlation  matrix  and  the  estimated  whitened  received 
signal  correlation  matrix  both  have  K  unique  eigenvalues  in  general  [121].  Therefore, 
an  eigenvalue  decomposition  of  the  estimated  whitened  received  signal  correlation 
matrix  does  not  reveal  the  number  of  components  directly.  Using  hypothesis  testing, 
however,  the  number  of  multipath  components  can  be  estimated  from  the  eigenvalues 
of  the  estimated  whitened  received  signal  correlation  matrix  [121]. 

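The received signal correlation matrix can be decomposed as

$$\mathbf{R}_u = \boldsymbol{\Psi} + \mathbf{R}_{\nu}, \tag{4.6.1}$$

where

$$\boldsymbol{\Psi} = \boldsymbol{\Phi}\,\mathbf{R}_y\,\boldsymbol{\Phi}^{H}, \tag{4.6.2}$$

$\boldsymbol{\Phi}$ is the full column rank array manifold matrix and $\mathbf{R}_y[n] = E\{\mathbf{y}[n]\mathbf{y}^{H}[n]\}$ is the path space signal correlation matrix, as defined in eq. (4.4.3). If the transmitted data symbols are unit energy and white, then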

Ry  H  =  G>]GP[«]. 

Assuming that $\mathbf{R}_y$ is non-singular, the rank of $\boldsymbol{\Psi}$ is equal to the number of paths $P$, which is also the rank of $\mathbf{R}_y$ and the number of columns in $\boldsymbol{\Phi}$. If $\boldsymbol{\Psi}$ is $K \times K$, then $K - P$ of the eigenvalues of $\boldsymbol{\Psi}$ are zero.

If the observation noise is white, then $\mathbf{R}_{\nu}$ has the form

$$\mathbf{R}_{\nu} = \sigma_{\nu}^{2}\,\mathbf{I}. \tag{4.6.3}$$

If the noise is not white, but the noise correlation matrix, $\mathbf{R}_{\nu}$, is either known a priori or can be estimated, then the received data can be whitened to produce an equivalent whitened problem

$$\tilde{\mathbf{u}} = \mathbf{R}_{\nu}^{-1/2}\,\mathbf{u}, \tag{4.6.4}$$

and

$$\mathbf{R}_{\tilde{u}} = \mathbf{R}_{\nu}^{-1/2}\,\boldsymbol{\Psi}\,\mathbf{R}_{\nu}^{-1/2} + \mathbf{I}. \tag{4.6.5}$$

In acoustic underwater communication problems neither the noise correlation matrix, $\mathbf{R}_{\nu}$, nor the signal plus noise correlation matrix, $\mathbf{R}_u$, is known a priori and both must be estimated from the data,

$$\hat{\mathbf{R}}_u[n] = \frac{1}{N} \sum_{m=n-N+1}^{n} \mathbf{u}[m]\,\mathbf{u}^{H}[m] \tag{4.6.6}$$

$$\hat{\mathbf{R}}_{\nu}[n] = \frac{1}{N} \sum_{m=n-N+1}^{n} \mathbf{u}_{\nu}[m]\,\mathbf{u}_{\nu}^{H}[m]. \tag{4.6.7}$$

In eq. (4.6.7), $\mathbf{u}_{\nu}$ are noise-only samples, which can be taken during quiet periods between packets. Eqs. (4.6.6) and (4.6.7) are the maximum likelihood estimates of the received signal correlation matrix and the observation noise correlation matrix, respectively, when both the observation noise and the data are described by zero-mean Gaussian distributions.

In this section, the problem of interest is to estimate the number of significant multipath components $P$ from the estimated received signal correlation matrix, $\hat{\mathbf{R}}_u$. Assuming the noise is white with unit variance (the noise can be whitened as above), this problem is equivalent to determining how many of the eigenvalues of $\mathbf{R}_u$ are statistically greater than one, given only the noisy estimate $\hat{\mathbf{R}}_u$.

A collection of techniques which solve this problem have their roots in information theory [121]. The two techniques explicitly discussed are the Akaike Information Criterion (AIC) [2, 3] and the Bayesian Information Criterion (BIC) [83, 90]. There has also been a flurry of activity lately by Nadakuditi and others solving this same subspace rank estimation problem using random matrix theory [67, 69, 68], but these results are not discussed explicitly here.
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Both  the  AIC  and  the  BIC  are  problems  that  seek  to  solve  a  cost  function 
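parametrized by a positive integer $r$,

$$J(r) = -2 \log p_u(\mathbf{u} \mid \hat{\boldsymbol{\xi}}^{(r)}) + \pi(r), \tag{4.6.8}$$

where $\log p_u(\mathbf{u} \mid \hat{\boldsymbol{\xi}}^{(r)})$ is the log-likelihood of $\mathbf{u}$ for estimated parameters $\hat{\boldsymbol{\xi}}^{(r)}$ and $\pi(r)$ is a penalty term related to the number of degrees of freedom in the model.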

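The parameter vector, $\hat{\boldsymbol{\xi}}^{(r)}$, contains: the maximum likelihood estimate of the observation noise variance, $(\sigma_{\nu}^{2})_{ml}$; the maximum likelihood estimates of the $r$ largest eigenvalues, $(\lambda_i)_{ml}$, $i = 1, \ldots, r$; and the corresponding eigenvectors, $(\boldsymbol{\phi}_i)_{ml}$, $i = 1, \ldots, r$, of the received signal correlation matrix, $\mathbf{R}_u$. If the received data are drawn from a Gaussian distribution (i.e. the observation noise and the signal are Gaussian), the eigenvalues and eigenvectors of the sample received signal correlation matrix are the maximum likelihood estimates, i.e. [4]

$$\hat{\mathbf{R}}_u = \frac{1}{N} \sum_{m=1}^{N} \mathbf{u}[m]\,\mathbf{u}^{H}[m] \tag{4.6.9}$$

$$(\lambda_i)_{ml} = \hat{\lambda}_i, \quad i = 1, \ldots, r \tag{4.6.10}$$

$$(\boldsymbol{\phi}_i)_{ml} = \hat{\boldsymbol{\phi}}_i, \quad i = 1, \ldots, r \tag{4.6.11}$$

$$(\sigma_{\nu}^{2})_{ml} = \frac{1}{K - r} \sum_{i=r+1}^{K} \hat{\lambda}_i, \tag{4.6.12}$$

where the subscript $ml$ indicates the maximum likelihood estimate of the true parameter. The maximum likelihood estimates of the elements of $\hat{\boldsymbol{\xi}}^{(r)}$ are

$$\hat{\boldsymbol{\xi}}^{(r)} = \left[ \hat{\sigma}_{\nu}^{2},\ \hat{\lambda}_1, \ldots, \hat{\lambda}_r,\ \hat{\boldsymbol{\phi}}_1^{T}, \ldots, \hat{\boldsymbol{\phi}}_r^{T} \right]^{T}. \tag{4.6.13}$$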

The total number of degrees of freedom in the parameter vector $\hat{\boldsymbol{\xi}}^{(r)}$ is the number of parameters that can be freely changed, where real parameters have one degree of freedom and complex parameters have two. Without constraints, the number of degrees of freedom in the parameter vector is 1 from the observed noise variance, $r$ from the real-valued eigenvalues, and $2rK$ from the complex eigenvectors, for a total of $1 + r + 2rK$. The eigenvectors are constrained to be unit norm, which reduces the available degrees of freedom by $2r$, and mutually orthogonal, which removes another $2 \cdot \frac{1}{2} r(r-1)$. Therefore, the total number of available degrees of freedom in the parameter vector $\hat{\boldsymbol{\xi}}^{(r)}$ is

$$\mathrm{DoF}(\hat{\boldsymbol{\xi}}^{(r)}) = r + 1 + 2rK - 2r - r(r-1) = r(2K - r) + 1.$$
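
The bookkeeping in this count can be checked term by term. The following sketch simply adds and subtracts the contributions listed above; the function name is hypothetical.

```python
def degrees_of_freedom(r, K):
    """Free real parameters in the rank-r model for a K-sensor array."""
    noise_variance = 1               # one real parameter
    eigenvalues = r                  # r real eigenvalues
    eigenvectors = 2 * r * K         # r complex vectors of length K
    unit_norm = 2 * r                # removed by the unit-norm constraint
    orthogonality = r * (r - 1)      # removed by mutual orthogonality, 2 * r(r-1)/2
    return noise_variance + eigenvalues + eigenvectors - unit_norm - orthogonality

# Compare against the closed form r(2K - r) + 1 for an 8-sensor array.
K = 8
matches = all(degrees_of_freedom(r, K) == r * (2 * K - r) + 1 for r in range(K + 1))
```

For example, with $K = 8$ and $r = 2$ the model has $2(16 - 2) + 1 = 29$ free real parameters.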


The difference between the AIC and the BIC is the form of the penalty function. The penalty function for the AIC is the available number of degrees of freedom in the parameter vector $\hat{\boldsymbol{\xi}}^{(r)}$. The penalty function for the BIC is the available number of degrees of freedom in $\hat{\boldsymbol{\xi}}^{(r)}$ scaled by half the log of the number of snapshots. The additional scaling term on the BIC penalty function leads to lower estimates of the number of observed paths than the AIC (on average) [90].

The parametrized log-likelihood function can be written as [125, 121]

$$L_r(r) = -\log p_u(\mathbf{u} \mid \hat{\boldsymbol{\xi}}^{(r)}) = N(K - r) \log \left[ \frac{\frac{1}{K-r} \sum_{i=r+1}^{K} \hat{\lambda}_i}{\left( \prod_{i=r+1}^{K} \hat{\lambda}_i \right)^{\frac{1}{K-r}}} \right], \tag{4.6.14}$$

where $N$ is the number of snapshots available. The log of the ratio of the arithmetic to the geometric means of a noisy data set is a measure of the additional information gained by the knowledge that the true data are all equal to the arithmetic mean of the observed data [127], i.e. how “surprising” a discovery would be that the true data are all equal to the arithmetic mean. This interpretation of the parametrized log-likelihood fits intuitively well for the AIC and BIC measures, which attempt to determine the number of equal eigenvalues of the received signal correlation matrix.

Using the expressions for both the log-likelihood ratio and the penalty function, the AIC and BIC can be written as

$$\mathrm{AIC}(r) = 2 L_r(r) + 2\, r(2K - r) \tag{4.6.15}$$

$$\mathrm{BIC}(r) = 2 L_r(r) + \left( r(2K - r) + 1 \right) \log N. \tag{4.6.16}$$

The estimate of the number of observed paths (the rank of the signal subspace) is

$$\hat{P}_{\mathrm{AIC}} = \arg\min_{r} \mathrm{AIC}(r) \tag{4.6.17}$$

$$\hat{P}_{\mathrm{BIC}} = \arg\min_{r} \mathrm{BIC}(r). \tag{4.6.18}$$

The estimate produced using the BIC is consistent, i.e. as the number of snapshots approaches $\infty$, the estimate converges to the true value. The estimate produced using the AIC is not consistent [125], but with finite amounts of data, the AIC tends to give a better estimate of the signal subspace rank than the BIC [121].
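
A compact sketch of the two estimators, operating directly on a set of eigenvalues, is given below. The eigenvalues and the number of snapshots in the example are made-up values chosen only to show the mechanics (two dominant eigenvalues and four near-equal noise eigenvalues); the function name is hypothetical, and the candidate ranks are restricted to $r = 0, \ldots, K-1$ so that at least one eigenvalue remains in the noise subspace.

```python
import math

def rank_estimates(eigenvalues, num_snapshots):
    """Return (P_AIC, P_BIC) from the eigenvalues of the sample correlation matrix."""
    lam = sorted(eigenvalues, reverse=True)
    K = len(lam)
    aic, bic = [], []
    for r in range(K):
        tail = lam[r:]                                   # the K - r smallest eigenvalues
        arithmetic = sum(tail) / len(tail)
        log_geometric = sum(math.log(x) for x in tail) / len(tail)
        # N (K - r) log(arithmetic mean / geometric mean), as in eq. (4.6.14).
        L_r = num_snapshots * (K - r) * (math.log(arithmetic) - log_geometric)
        aic.append(2 * L_r + 2 * r * (2 * K - r))
        bic.append(2 * L_r + (r * (2 * K - r) + 1) * math.log(num_snapshots))
    return aic.index(min(aic)), bic.index(min(bic))

# Hypothetical eigenvalues: two strong components above a nearly flat noise floor.
p_aic, p_bic = rank_estimates([10.0, 8.0, 1.05, 1.0, 0.98, 0.97], num_snapshots=1000)
```

For these values both criteria select $r = 2$: the likelihood term collapses once the two dominant eigenvalues are removed from the tail, and the penalty term then discourages any larger rank.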


Effectiveness  of  information  theoretic  methods  for  determining  number  of 
observed  paths 


One pitfall of these information theoretic methods is that the signal-subspace dimension estimate tends to be higher than the number of paths when the channel is time-varying. These information theoretic methods use an assumption that the matrix $\boldsymbol{\Phi}$ is constant over the averaging window. For a rapidly varying underwater channel, this may not be the case.

When the channel is varying in time, the estimated dimension can be greater than the number of paths. To illustrate this point, consider a unit energy signal that changes direction halfway through an averaging window, i.e.

$$\mathbf{u}[n] = \begin{cases} \mathbf{v}(\theta_0) & n < \frac{N_{\mathrm{win}}}{2} \\ \mathbf{v}(\theta_1) & n \geq \frac{N_{\mathrm{win}}}{2}. \end{cases} \tag{4.6.19}$$

In the above expressions, $\mathbf{v}(\theta_i)$ is the array manifold vector parametrized by the angle of arrival and $N_{\mathrm{win}}$ is the window length. The time-averaged signal will have an estimated signal subspace dimension of 2, even though at any time instant there is only one path present.
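
This effect is easy to reproduce in a noise-free numerical sketch. The array size, window length, and the two arrival angles below are arbitrary illustrative choices, and the manifold vector assumes a half-wavelength-spaced line array.

```python
import numpy as np

def manifold(theta_deg, num_sensors):
    """Unit-energy array manifold vector for a half-wavelength-spaced line array."""
    k = np.arange(num_sensors)
    phase = np.pi * k * np.cos(np.radians(theta_deg))
    return np.exp(1j * phase) / np.sqrt(num_sensors)

def signal_rank(snapshots, tol=1e-9):
    """Number of non-negligible eigenvalues of the sample correlation matrix."""
    R = snapshots @ snapshots.conj().T / snapshots.shape[1]
    return int(np.sum(np.linalg.eigvalsh(R) > tol))

K, N_win = 8, 100
first_half = np.tile(manifold(80.0, K)[:, None], (1, N_win // 2))
second_half = np.tile(manifold(100.0, K)[:, None], (1, N_win // 2))
window = np.hstack([first_half, second_half])

rank_first_half = signal_rank(first_half)    # one direction only
rank_full_window = signal_rank(window)       # both directions averaged together
```

Either half of the window on its own has a rank-one sample correlation matrix, but averaging over the full window yields rank two, which is what a subspace-dimension estimator would report.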

This hazard is intrinsic to the framework of these estimators and the only workaround is to shorten the data averaging time. Random-matrix methods attempt to accomplish this by minimizing the amount of data needed to estimate the statistics (see e.g. [69]). For a constantly varying channel, however, there is no window short enough for the channel to appear entirely stationary.

Another issue with reducing the averaging window is the validity of the narrowband assumption. One way to enforce the narrowband assumption when using wideband data is to first take the discrete Fourier transform (DFT) of the received signal and then process each resultant frequency bin individually, but the available data is reduced proportionally to the DFT length. Note that the AIC and BIC are correctly estimating the observed signal subspace dimension. Unfortunately, in the case of a time-varying channel, this dimension is not necessarily equal to the number of observed paths ($\hat{P}_{\mathrm{AIC,BIC}} \neq P$). Using these methods may lead one to use more beams than are strictly necessary, reducing the benefits of beamspace processing.

4.6.3 Generalized $\chi^2$ testing

Generalized $\chi^2$ random variables

An alternative statistical method for determining the number of beams comes from the atmospheric science community [11] and is based on statistical analysis of $\chi^2$ random variables. A $\chi^2$ random variable is the sum of the squares of independent and identically distributed Gaussian random variables with zero mean and unit variance [70]. Changing the random variables in the sum to circularly-symmetric complex Gaussian random variables with variance $\sigma^2$ is a generalization of the $\chi^2$ random variable and is what is referred to in this chapter as a generalized $\chi^2$ random variable (this also describes a gamma-distributed random variable).

As an example of a generalized $\chi^2$ random variable, consider

$$k = \sum_{i=1}^{D/2} |q_i|^2,$$

where $q_i \sim \mathcal{CN}(0, \sigma^2)$, i.e. $q_i$ is circularly symmetric complex
Gaussian distributed with mean zero and variance $\sigma^2$. This implies that $k$ is
a generalized $\chi^2$ random variable with

$$E\{k\} = \mu_k = \tfrac{1}{2} D \sigma^2, \qquad
\mathrm{var}(k) = \sigma_k^2 = \tfrac{1}{2} D \sigma^4. \qquad (4.6.20)$$

The number of degrees of freedom (DoF) of a $\chi^2$ random variable is equal to the
number of independent random variables in the sum. The number of independent
random variables in a circularly-symmetric complex Gaussian random variable is two:
one for the real part and one for the imaginary part. The estimate of the number of
paths will soon be shown to be the number of complex circularly symmetric Gaussian
random variables in the sum, or half the number of degrees of freedom in the
generalized $\chi^2$ random variable. A way to calculate the number of complex random
variables in the generalized $\chi^2$ random variable, $k$, is [11]

$$P_k = \tfrac{1}{2} D = \frac{\mu_k^2}{\sigma_k^2}. \qquad (4.6.21)$$
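
The moment ratio in eq. (4.6.21) can be checked numerically. The following is a
minimal Monte Carlo sketch and is not part of the original analysis; the function
names, the seed, and the choice of four complex terms with $\sigma^2 = 2.5$ are
illustrative assumptions.

```python
import random


def generalized_chi2_sample(num_complex, sigma2, rng):
    """Draw one sample of k = sum_i |q_i|^2 with q_i ~ CN(0, sigma2)."""
    # A circularly-symmetric complex Gaussian with variance sigma2 has
    # independent real and imaginary parts, each with variance sigma2 / 2.
    std = (sigma2 / 2.0) ** 0.5
    return sum(rng.gauss(0.0, std) ** 2 + rng.gauss(0.0, std) ** 2
               for _ in range(num_complex))


def moment_ratio(samples):
    """Return (sample mean)^2 / (sample variance), the ratio in eq. (4.6.21)."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    return mean * mean / var


rng = random.Random(0)  # fixed seed so the sketch is repeatable
samples = [generalized_chi2_sample(4, 2.5, rng) for _ in range(20000)]
estimate = moment_ratio(samples)
print(round(estimate, 2))  # expected to be close to 4, the number of complex terms
```

The ratio does not depend on $\sigma^2$, which is why it can be used to count the
complex terms in the sum without knowing their variance.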

Using the Rayleigh fading model to create a generalized $\chi^2$ random variable

A pertinent example of a generalized $\chi^2$ random variable is the sum of the squared
absolute values of the coefficients of a Rayleigh fading channel, i.e. $\sum_i |g_i|^2$.
When a channel is Rayleigh fading, generalized $\chi^2$ analysis can be used to
determine the number of channel coefficients, which equals the number of paths.

Over  narrow  frequency  ranges,  the  underwater  communication  channel  appears 
approximately  Rayleigh  fading  [53].  The  narrowband  Rayleigh  fading  channel  model 
considered  in  this  section  characterizes  the  fluctuations  in  each  path  using  one  random 
variable, 

$$\mathbf{g}_i(t) = g_i(t)\,\mathbf{v}(\theta_i)\,\delta(t - \tau_i). \qquad (4.6.22)$$

In the above expression, $g_i(t)$ is a circularly-symmetric complex Gaussian random
process, $\mathbf{v}(\theta_i)$ is an array manifold vector with angle of arrival
$\theta_i$, and $\delta(t - \tau_i)$ represents the path delay. When using a discrete
time system, the channel is sampled at times $t = mT$, where $T$ is the sampling
period and $m$ is an integer.

Recall from eq. (4.4.3) that the received signal has the form

$$\mathbf{u}[n] = \mathbf{\Phi}\,\mathbf{G}_p^H[n]\,\mathbf{d}'[n] + \mathbf{v}[n]. \qquad (4.6.23)$$

Several  simplifying  assumptions  are  made  throughout  this  section  to  clarify  the 
analysis: 

1. The path delays, $\tau_i$, are multiples of the sampling period and no two paths
have the same delay. These conditions imply that there is exactly one nonzero
coefficient per column of $\mathbf{G}_p$ and at most one nonzero coefficient per row.

2. The channel path coefficients, $g_i$, are IID and distributed as
$g_i \sim \mathcal{CN}(0, \sigma_g^2)$.

3. Each sensor observes the same channel, $\mathbf{G}_p[n]$. This implies that the
array is short enough that there is no significant path length difference from top
to bottom for any observed path.

4. The observation noise is spatially and temporally white with zero mean and
variance $\sigma_v^2$.

5. The transmitted data symbols are IID with zero mean and unit variance
(i.e. $\sigma_d^2 = 1$).

The  first  assumption  is  very  approximate,  but  allows  for  some  interesting  results 
that  are  supported  by  the  data.  The  next  two  assumptions  simplify  the  analysis 
significantly  and  the  effect  of  relaxing  these  assumptions  is  discussed  later  in  the 
section.  The  fourth  assumption  is  accurate  in  the  high  SNR  regime,  but  the  observed 
results  do  not  appear  to  be  sensitive  to  this  assumption.  The  last  assumption  is 
standard  in  communications  research. 

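To determine the number of multipath components, a functional of the received
data vector is created,

$$F[n] = \mathbf{u}^H[n]\,\mathbf{u}[n]. \qquad (4.6.24)$$

Substituting the channel model from eq. (4.6.23) into this functional gives

$$F[n] = (\mathbf{\Phi}\mathbf{G}_p^H[n]\mathbf{d}'[n] + \mathbf{v}[n])^H
(\mathbf{\Phi}\mathbf{G}_p^H[n]\mathbf{d}'[n] + \mathbf{v}[n]). \qquad (4.6.25)$$

After performing the multiplications, the expanded version of the functional is

$$\begin{aligned}
F[n] ={}& \mathbf{d}'^H[n]\mathbf{G}_p[n]\mathbf{\Phi}^H\mathbf{\Phi}\mathbf{G}_p^H[n]\mathbf{d}'[n] \\
&+ \mathbf{v}^H[n]\mathbf{\Phi}\mathbf{G}_p^H[n]\mathbf{d}'[n]
 + \mathbf{d}'^H[n]\mathbf{G}_p[n]\mathbf{\Phi}^H\mathbf{v}[n]
 + \mathbf{v}^H[n]\mathbf{v}[n]. \qquad (4.6.26)
\end{aligned}$$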

In the above expression, the vector of transmitted symbols, $\mathbf{d}'[n]$, has
time-invariant statistics and the symbols are IID. Since the data is independent of
both the channel and the noise, terms including $\mathbf{d}'[n]$ are replaced by their
expected value with respect to the transmitted data symbols. Over the averaging
windows used to estimate the statistics of the received data, this substitution
does not introduce noticeable errors. Using this substitution, eq. (4.6.26) becomes

$$F[n] = \mathrm{Tr}(\mathbf{\Phi}\mathbf{G}_p^H[n]\mathbf{G}_p[n]\mathbf{\Phi}^H)
+ \mathbf{v}^H[n]\mathbf{v}[n]. \qquad (4.6.27)$$

The first term is found using the identity of the trace operator,
$\mathrm{Tr}(\mathbf{AB}) = \mathrm{Tr}(\mathbf{BA})$. The second and third terms from
eq. (4.6.26) vanish since the transmitted data symbols are zero mean and
uncorrelated with the observation noise.

Using the assumption that $\mathbf{G}_p[n]$ contains exactly one non-zero entry per
column and at most one non-zero entry per row, $\mathbf{G}_p^H[n]\mathbf{G}_p[n]$ is the
diagonal matrix

$$\mathbf{G}_p^H[n]\mathbf{G}_p[n] =
\begin{bmatrix}
|g_1[n]|^2 & 0 & 0 & \cdots & 0 \\
0 & |g_2[n]|^2 & 0 & \cdots & 0 \\
0 & 0 & |g_3[n]|^2 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & |g_P[n]|^2
\end{bmatrix}.$$

The delay index on the channel coefficients is not shown since there is only one
non-zero coefficient per path.

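Recall that the $i$th column of the array manifold matrix, $\mathbf{\Phi}$, is the
array manifold vector for a linear array, $\mathbf{v}(\theta_i)$. Using the relation
$\mathbf{v}^H(\theta_i)\mathbf{v}(\theta_i) = K$, the first term of eq. (4.6.27) becomes

$$\mathrm{Tr}(\mathbf{\Phi}\mathbf{G}_p^H[n]\mathbf{G}_p[n]\mathbf{\Phi}^H)
= K \sum_{i=1}^{P} |g_i[n]|^2.$$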

All of these assumptions and matrix manipulations lead to a usable form of the
functional,

$$F[n] = K \sum_{i=1}^{P} |g_i[n]|^2 + \sum_{j=1}^{K} |v_j[n]|^2, \qquad (4.6.28)$$

where $\mathbf{v}^H[n]\mathbf{v}[n]$ is written using sum notation. Both the channel
coefficients, $g_i$, and the observation noise terms, $v_j$, are circularly-symmetric
complex Gaussian random variables. Therefore, the functional $F[n]$ is the sum of two
generalized $\chi^2$ random variables: one for the channel coefficients and one for the
noise.


Determining the number of beams from the Rayleigh fading model

Given that the number of sensors is $K$ and the number of non-zero channel
coefficients (i.e. the number of paths) is $P$, the mean, $\mu_F$, and variance,
$\sigma_F^2$, of the functional, $F[n]$, are

$$\mu_F = K P \sigma_g^2 + K \sigma_v^2 \qquad (4.6.29)$$

$$\sigma_F^2 = K^2 P \sigma_g^4 + K \sigma_v^4. \qquad (4.6.30)$$

The inverse signal to noise ratio is defined as

$$\rho = \frac{\sigma_v^2}{\sigma_g^2}. \qquad (4.6.31)$$

Applying the same ratio test used to determine the number of complex components of
a generalized $\chi^2$ random variable in eq. (4.6.21) to the functional, $F[n]$,
produces the relation

$$P_F = \frac{\mu_F^2}{\sigma_F^2} = \frac{K (P + \rho)^2}{K P + \rho^2}. \qquad (4.6.32)$$

There are three interesting regions of $\rho$ for the estimated path count, $P_F$:
high SNR ($\rho \approx 0$), low SNR ($\rho \approx \infty$), and the maximum path
count estimate, which occurs when $\rho = K$. In the high SNR region, the parameter
$\rho \to 0$, so $P_F \approx P$, i.e. the path count estimate equals the number of
paths. In the low SNR region, the parameter $\rho \to \infty$, so $P_F \approx K$, the
number of sensors, since the noise space is full rank.
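
As a check on eq. (4.6.32), the intermediate algebra can be written out. This is a
sketch that assumes the variance of each $|g_i|^2$ is $\sigma_g^4$ and the variance of
each noise term $|v_j|^2$ is $\sigma_v^4$, as holds for circularly-symmetric complex
Gaussian variables; the symbols follow eqs. (4.6.29) through (4.6.31).

```latex
\begin{align*}
\mu_F &= K P \sigma_g^2 + K \sigma_v^2 = K \sigma_g^2 (P + \rho), \\
\sigma_F^2 &= K^2 P \sigma_g^4 + K \sigma_v^4 = K \sigma_g^4 (K P + \rho^2), \\
\frac{\mu_F^2}{\sigma_F^2}
  &= \frac{K^2 \sigma_g^4 (P + \rho)^2}{K \sigma_g^4 (K P + \rho^2)}
   = \frac{K (P + \rho)^2}{K P + \rho^2}.
\end{align*}
```

Setting $\rho = 0$ in the last line returns $P$, and dividing the numerator and
denominator by $\rho^2$ shows that the ratio tends to $K$ as $\rho \to \infty$.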

For the same channel statistics, as the SNR is decreased the estimated number of
paths increases. This could be a desirable feature of this estimator since, as the
SNR is decreased, the estimator allows for more beams, which implies better noise
immunity (if the noise on each beam is assumed to be independent). More beams also
imply more parameters to estimate, so components with more variability cannot be
tracked. Thus, this behavior of the estimator should be taken into account when
designing systems which use $\chi^2$ methods for estimating the number of paths.

The DoF count estimate is not a monotonic function of $\rho$, but the relation

$$P_F \le P + K \qquad (4.6.33)$$

can be verified by substitution. Equality is achieved when $\rho = K$. Thus, $P_F$ can
be greater than the number of sensors, but only for a very low SNR (i.e.
$\rho \approx K$).
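
The three regimes of eq. (4.6.32) and the bound in eq. (4.6.33) can be evaluated
directly. The sketch below is illustrative only: the function name is hypothetical,
and the values $K = 24$ and $P = 7$ are example inputs rather than results.

```python
def path_count_mm(num_paths, num_sensors, rho):
    """Evaluate eq. (4.6.32): P_F = K (P + rho)^2 / (K P + rho^2)."""
    k, p = num_sensors, num_paths
    return k * (p + rho) ** 2 / (k * p + rho ** 2)


K, P = 24, 7  # illustrative sensor and path counts

high_snr = path_count_mm(P, K, 1e-9)   # rho -> 0: estimate approaches P
low_snr = path_count_mm(P, K, 1e9)     # rho -> infinity: estimate approaches K
peak = path_count_mm(P, K, float(K))   # rho = K: estimate reaches P + K

# Scan a grid of rho values to confirm the bound P_F <= P + K numerically.
grid_max = max(path_count_mm(P, K, 0.1 * i) for i in range(1, 2000))

print(round(high_snr, 3), round(low_snr, 3), round(peak, 3))
```

The scan shows the non-monotonic behavior described above: the estimate rises from
$P$ to a peak of $P + K$ at $\rho = K$ and then falls back toward $K$.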

Another way to estimate the degrees of freedom in a $\chi^2$ random variable is
proposed in [6, 11, 113, 12] based on the received signal correlation matrix
$\mathbf{R}_u$. In this approach, the estimated number of degrees of freedom is

$$P_{ef} = \frac{(\mathrm{Tr}(\mathbf{R}_u))^2}{\mathrm{Tr}(\mathbf{R}_u^2)}
= \frac{\left(\sum_{i=1}^{K} \lambda_i\right)^2}{\sum_{i=1}^{K} \lambda_i^2},
\qquad (4.6.34)$$

where $\lambda_i$ is the $i$th eigenvalue of the received signal correlation matrix.
In the literature this method assumes that the received data vector $\mathbf{u}[n]$
is a Gaussian random vector, where each element is zero-mean and unit variance,
which is not the case for the underwater communication problem. This function of
the eigenvalues, however, turns out to be a reasonable estimator of the number of
paths in the underwater channel. The numerator and the denominator are evaluated
independently to justify that eq. (4.6.34) is a good estimator of the number of
paths.


The trace of the received signal correlation matrix is equivalent to the mean of
$F[n]$, through the relation

$$E\{F[n]\} = E\{\mathbf{u}^H[n]\mathbf{u}[n]\}
= E\{\mathrm{Tr}(\mathbf{u}[n]\mathbf{u}^H[n])\} = \mathrm{Tr}(\mathbf{R}_u). \qquad (4.6.35)$$

Thus, the numerator of $P_{ef}$ is the square of the expected value of $F[n]$, which
is also the numerator of $P_F$.

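The denominator of eq. (4.6.34) is more complicated to justify. A first step is
to substitute the received signal model from eq. (4.6.23) into the expression for
the received signal correlation matrix,

$$\mathbf{R}_u = E\{\mathbf{u}[n]\mathbf{u}^H[n]\}
= E\{(\mathbf{\Phi}\mathbf{G}_p^H[n]\mathbf{d}'[n] + \mathbf{v}[n])
(\mathbf{\Phi}\mathbf{G}_p^H[n]\mathbf{d}'[n] + \mathbf{v}[n])^H\}. \qquad (4.6.36)$$

Evaluating the expectation with respect to the transmitted symbols, the noise, and
the channel coefficients, the received signal correlation matrix is

$$\mathbf{R}_u = \sigma_g^2 \mathbf{\Phi}\mathbf{\Phi}^H + \sigma_v^2 \mathbf{I}, \qquad (4.6.37)$$

where the noise is still assumed to be spatially and temporally white. Since
$E\{F[n]\} = \mathrm{Tr}(\mathbf{R}_u)$, the terms in eq. (4.6.37) correspond exactly to
the terms in eq. (4.6.29). Comparing terms, the trace of the matrix product
$\mathbf{\Phi}\mathbf{\Phi}^H$ is

$$\mathrm{Tr}(\mathbf{\Phi}\mathbf{\Phi}^H) = K P. \qquad (4.6.38)$$

Using eq. (4.6.37) and eq. (4.6.38), the expression for $\mathrm{Tr}(\mathbf{R}_u^2)$
becomes

$$\begin{aligned}
\mathrm{Tr}(\mathbf{R}_u^2) &= \mathrm{Tr}\left((\sigma_g^2 \mathbf{\Phi}\mathbf{\Phi}^H + \sigma_v^2 \mathbf{I})^2\right) \\
&= \sigma_g^4 \alpha_{ef} + 2 K P \sigma_g^2 \sigma_v^2 + K \sigma_v^4. \qquad (4.6.39)
\end{aligned}$$

In this expression, the quantity $\alpha_{ef}$ is defined as

$$\alpha_{ef} = \mathrm{Tr}\left((\mathbf{\Phi}\mathbf{\Phi}^H)^2\right). \qquad (4.6.40)$$

Recall that the columns of the matrix $\mathbf{\Phi}$ are array manifold vectors,
which can be parametrized by an AoA $\theta$. The $i$th column of $\mathbf{\Phi}$ is
denoted as $\mathbf{v}(\theta_i)$. The expression for $\alpha_{ef}$ is

$$\alpha_{ef} = \sum_{k=1}^{P} \sum_{\ell=1}^{P} |\mathbf{v}^H(\theta_k)\mathbf{v}(\theta_\ell)|^2
= K^2 P + 2 \sum_{k=1}^{P-1} \sum_{\ell=k+1}^{P} |\mathbf{v}^H(\theta_k)\mathbf{v}(\theta_\ell)|^2
= K^2 P + 2 \gamma_{ef}. \qquad (4.6.41)$$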

The expression for $\gamma_{ef}$ depends on the angles $\theta_1, \ldots, \theta_P$,
which depend on the particulars of the communication channel. Thus there is no way
to further simplify the expression for $\gamma_{ef}$. The parameter $\gamma_{ef}$,
however, can be bounded by

$$0 \le \gamma_{ef} \le \tfrac{1}{2} K^2 P (P - 1),$$

which implies bounds for $\alpha_{ef}$,

$$K^2 P \le \alpha_{ef} \le K^2 P^2.$$

The lower bound is achieved when all of the array manifold vectors are orthogonal,
i.e. $\mathbf{v}^H(\theta_k)\mathbf{v}(\theta_\ell) = 0$ when $k \neq \ell$. The upper
bound is achieved when all of the array manifold vectors are the same, i.e.
$\mathbf{v}(\theta_1) = \mathbf{v}(\theta_2) = \cdots = \mathbf{v}(\theta_P)$.

Using the derived expressions for the numerator and denominator, the estimate
of the number of arrivals, $P_{ef}$, is

$$P_{ef} = \frac{(\mathrm{Tr}(\mathbf{R}_u))^2}{\mathrm{Tr}(\mathbf{R}_u^2)}
= \frac{K^2 (P + \rho)^2}{\alpha_{ef} + 2 K P \rho + K \rho^2}. \qquad (4.6.42)$$

In the low SNR region ($\rho \to \infty$), regardless of the path AoA, the number of
arrivals estimate $P_{ef} \approx K$. In the high SNR region ($\rho \to 0$), the result
depends on the alignment of the array manifold vectors. When the array manifold
vectors are orthogonal, $P_{ef} \approx P$. When the array manifold vectors are not
orthogonal in the high SNR region, $P_{ef}$ is less than $P$. In the degenerate case
when all of the array manifold vectors are equal, $P_{ef} = 1$.
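
The dependence of $P_{ef}$ on the arrival angles can be illustrated with a short
numerical sketch. The array geometry is an assumption made only for this example:
a uniform linear array with half-wavelength spacing and angles measured from the
array axis. The function names are hypothetical.

```python
import cmath
import math


def manifold_vector(theta, num_sensors):
    """Array manifold vector of an assumed half-wavelength uniform linear array."""
    return [cmath.exp(1j * math.pi * m * math.cos(theta)) for m in range(num_sensors)]


def alpha_ef(thetas, num_sensors):
    """alpha_ef = Tr((Phi Phi^H)^2) = sum_k sum_l |v^H(theta_k) v(theta_l)|^2."""
    vectors = [manifold_vector(t, num_sensors) for t in thetas]
    total = 0.0
    for vk in vectors:
        for vl in vectors:
            inner = sum(a.conjugate() * b for a, b in zip(vk, vl))
            total += abs(inner) ** 2
    return total


def path_count_eig(thetas, num_sensors, rho):
    """Closed form K^2 (P + rho)^2 / (alpha_ef + 2 K P rho + K rho^2)."""
    k, p = num_sensors, len(thetas)
    numerator = k ** 2 * (p + rho) ** 2
    denominator = alpha_ef(thetas, k) + 2 * k * p * rho + k * rho ** 2
    return numerator / denominator


K = 8
# For this geometry, manifold vectors are orthogonal when cos(theta) differs by 2/K.
orthogonal = [math.acos(c) for c in (0.0, 0.25, 0.5)]
identical = [1.0, 1.0, 1.0]

p_orthogonal = path_count_eig(orthogonal, K, 0.0)  # high SNR, orthogonal arrivals
p_identical = path_count_eig(identical, K, 0.0)    # high SNR, coincident arrivals
p_low_snr = path_count_eig(orthogonal, K, 1e9)     # low SNR limit
print(round(p_orthogonal, 3), round(p_identical, 3), round(p_low_snr, 3))
```

With three orthogonal arrivals the high SNR estimate is three, with three coincident
arrivals it collapses to one, and at very low SNR it approaches the number of
sensors regardless of the angles.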

The dependence of $P_{ef}$ on the AoA in the high SNR regime has a useful
interpretation. When two AoA are nearly equal, the DoF count estimate is reduced
since fewer beams are needed to capture energetic paths. This is generally a
positive result since highly time-varying paths will likely be ignored. Reducing
the number of beams does place additional constraints on the DFE since there are
fewer adaptive parameters, so the end effect on residual error is unclear.

$P_{ef}$ is a monotonically decreasing function of the SNR, so there is no local
maximum like there was for $P_F$. As the SNR decreases, there is a smooth transition
of the number of paths estimated from the high SNR estimate to $K$.

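An additional result is that the term $\mathrm{Tr}(\mathbf{R}_u^2)$ is an upper bound
of $\sigma_F^2$:

$$\begin{aligned}
\sigma_F^2 &= K^2 P \sigma_g^4 + K \sigma_v^4 \\
&\le K^2 P \sigma_g^4 + 2 K P \sigma_g^2 \sigma_v^2 + K \sigma_v^4 \\
&\le \alpha_{ef} \sigma_g^4 + 2 K P \sigma_g^2 \sigma_v^2 + K \sigma_v^4 \\
&= \mathrm{Tr}(\mathbf{R}_u^2). \qquad (4.6.43)
\end{aligned}$$

The expression in the second line is the lower bound of $\mathrm{Tr}(\mathbf{R}_u^2)$.
Equality of all terms is achieved when $\rho = \infty$, i.e. at very low SNR. In other
SNR regions, $\sigma_F^2$ is strictly less than $\mathrm{Tr}(\mathbf{R}_u^2)$. Using the
equality relation from eq. (4.6.35) and the inequality relation from eq. (4.6.43),
the number of arrival estimators are related by

$$P_{ef} \le P_F. \qquad (4.6.44)$$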

Estimating the number of paths from the received signal

The true statistics of the received signal and noise are not known and must be
estimated from the received signal. In this subsection, estimators based on
estimated statistics are presented.

One method for estimating the number of paths from the data is to directly
estimate the statistics of $F[n]$, a modified version of the method of moments from
[6, 11]. Replacing the mean and variance of $F[n]$ in eq. (4.6.32) with estimates
based on a length $M$ sliding window gives the estimate of the number of arrivals,

$$\hat{P}_F = \frac{\left(\frac{1}{M}\sum_{i=1}^{M} F[n-i]\right)^2}
{\frac{1}{M}\sum_{i=1}^{M} F^2[n-i] - \left(\frac{1}{M}\sum_{i=1}^{M} F[n-i]\right)^2}.
\qquad (4.6.45)$$

This method circumvents the need for explicit calculation of the PDF of $F$, but
there is no guaranteed upper bound on the estimated number of paths.
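
The sliding-window estimator of eq. (4.6.45) can be sketched in a few lines. This
is a minimal illustration under stated assumptions, not the processing chain used
for the experimental results: the function names are hypothetical, and the test
data is synthetic, generated from the form of eq. (4.6.28) with independent draws
at every time index.

```python
import random
from collections import deque


def mm_path_count(f_values, window):
    """Yield the estimate of eq. (4.6.45) for each n >= window.

    The estimate at time n uses the previous values F[n-1], ..., F[n-window].
    """
    buf = deque(maxlen=window)
    for f in f_values:
        if len(buf) == window:
            mean = sum(buf) / window
            second_moment = sum(x * x for x in buf) / window
            variance = second_moment - mean * mean
            # Guard against a degenerate (constant) window.
            yield mean * mean / variance if variance > 0.0 else float("inf")
        buf.append(f)


def synth_functional(num_sensors, num_paths, sigma_g2, sigma_v2, length, rng):
    """Hypothetical data: F[n] = K sum_i |g_i|^2 + sum_j |v_j|^2, redrawn each n."""
    def cn_power(var):
        std = (var / 2.0) ** 0.5
        return rng.gauss(0.0, std) ** 2 + rng.gauss(0.0, std) ** 2

    return [num_sensors * sum(cn_power(sigma_g2) for _ in range(num_paths))
            + sum(cn_power(sigma_v2) for _ in range(num_sensors))
            for _ in range(length)]


rng = random.Random(1)  # fixed seed so the sketch is repeatable
f_values = synth_functional(8, 3, 1.0, 0.01, 3000, rng)
estimates = list(mm_path_count(f_values, 500))
average_estimate = sum(estimates) / len(estimates)
print(round(average_estimate, 2))  # expected near 3 for this high-SNR case
```

Because the denominator is a sample variance, a short window can make it small by
chance, which is one way to see why the estimate has no guaranteed upper bound.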

A second method for estimating the number of paths from the received signal is
to create an estimate of the received signal correlation matrix,
$\hat{\mathbf{R}}_u$. This estimator uses the function of the eigenvalues from
eq. (4.6.34), with the estimated correlation matrix eigenvalues, $\hat{\lambda}_i$,
in place of the true correlation matrix eigenvalues, $\lambda_i$. This estimator is

$$\hat{P}_{ef} = \frac{(\mathrm{Tr}(\hat{\mathbf{R}}_u))^2}{\mathrm{Tr}(\hat{\mathbf{R}}_u^2)}
= \frac{\left(\sum_{i=1}^{K} \hat{\lambda}_i\right)^2}{\sum_{i=1}^{K} \hat{\lambda}_i^2}.
\qquad (4.6.46)$$

The last section stated that $\hat{\lambda}_i$ is the maximum likelihood estimate of
$\lambda_i$. Therefore, the path count estimate $\hat{P}_{ef}$ is a maximum likelihood
estimate of $P_{ef}$. Due to the way they are computed, the number of arrival
estimates using both the true eigen-structure, $P_{ef}$, and the estimated
eigen-structure, $\hat{P}_{ef}$, are guaranteed to satisfy
$1 \le P_{ef}, \hat{P}_{ef} \le K$.
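
The eigenvalue ratio in eq. (4.6.46) can be computed without an eigendecomposition,
since the sum of the eigenvalues equals the trace and, for a Hermitian matrix, the
sum of the squared eigenvalues equals the sum of the squared magnitudes of all
matrix entries. The sketch below uses that identity; the function names and the
synthetic snapshots are illustrative assumptions.

```python
import random


def sample_correlation(snapshots):
    """Sample correlation matrix (1/M) sum_m u[m] u^H[m] as a list of lists."""
    k = len(snapshots[0])
    r = [[0j] * k for _ in range(k)]
    for u in snapshots:
        for a in range(k):
            for b in range(k):
                r[a][b] += u[a] * u[b].conjugate()
    m = len(snapshots)
    return [[x / m for x in row] for row in r]


def eig_ratio_path_count(r):
    """(Tr R)^2 / Tr(R^2), with Tr(R^2) taken as the squared Frobenius norm."""
    trace = sum(r[i][i].real for i in range(len(r)))
    trace_sq = sum(abs(x) ** 2 for row in r for x in row)
    return trace * trace / trace_sq


def complex_gaussian(var, rng):
    """One draw from CN(0, var)."""
    std = (var / 2.0) ** 0.5
    return complex(rng.gauss(0.0, std), rng.gauss(0.0, std))


rng = random.Random(2)  # fixed seed so the sketch is repeatable
steering = [1.0, 1.0, 1.0, 1.0]  # hypothetical broadside arrival on 4 sensors

# One fading path and no noise: every snapshot is a scalar multiple of `steering`.
single_path = []
for _ in range(200):
    g = complex_gaussian(1.0, rng)
    single_path.append([g * s for s in steering])

# Spatially white noise only.
noise_only = [[complex_gaussian(1.0, rng) for _ in range(4)] for _ in range(4000)]

p_single = eig_ratio_path_count(sample_correlation(single_path))
p_noise = eig_ratio_path_count(sample_correlation(noise_only))
print(round(p_single, 3), round(p_noise, 3))
```

The two cases sit at the ends of the guaranteed range: a rank-one correlation matrix
gives an estimate of one, and a nearly white correlation matrix gives an estimate
close to the number of sensors.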


Effectiveness of $\chi^2$ methods for determining number of observed paths

The use of generalized $\chi^2$ techniques for creating number of path estimators has
distinct advantages over the information theoretic methods. There is not the same
requirement that the AoA for the different paths be time-invariant when using
generalized $\chi^2$ techniques: the method of moments estimator,
$\hat{P}_F \approx P_F$, has no explicit dependence on the AoA, and the eigenvalue
ratio estimate, $\hat{P}_{ef} \approx P_{ef}$, is upper bounded by the number of paths
in the high SNR region.

The generalized $\chi^2$ methods have less dependence on the variation of the AoA
than the information theoretic methods, even though both rely on eigenvalues of the
received signal correlation matrix. The information theoretic methods are based on
a successive ratio test [121], so a change in the individual eigenvalues (caused by
a change of an AoA) can drastically change the estimate. By contrast, the
generalized $\chi^2$ methods rely only on aggregate sums of the eigenvalues, so
individual eigenvalues (and thus the changes in AoA) play a decreased role in
determining the number of paths.

There are some things to keep in mind when using generalized $\chi^2$ analysis for
underwater channels. First, the underwater channel is only approximately Rayleigh
fading [52, 130], so the proposed channel model can be inaccurate. This causes some
degradation in the performance [11], but the effect can be mitigated by using
narrower DFT bins so that the Rayleigh fading assumption is more accurate [53].
Also, the multipath arrivals do not all have the same energy, which reduces the
accuracy of the estimator. Fortunately, eighty to ninety percent of the degrees of
freedom are captured using $\chi^2$ types of estimation methods [39].

4.7  Experimental  evidence 

In  this  section,  experimental  evidence  is  presented  for  methods  of  computing  the 
beam  weights  and  the  number  of  beams  to  use. 

4.7.1  SPACE08  experiment  setup 

The SPACE08 experiment was performed off the coast of Martha's Vineyard, MA from
Oct. 14th through Nov. 1st, 2008. The water depth was approximately 15 meters, the
transmitter was approximately 4 meters from the sea floor, and the tops of the
receive arrays were about 3.25 meters above the sea floor. Figure 4.7.1 illustrates
the experimental setup.

The carrier frequency was $f_c = 12.5$ kHz, the bandwidth was $B = 6.51$ kHz, and
the sampling frequency was $f_s = 39.06$ kHz. The transmitted signal was the first
20,000 symbols of a repeated binary phase shift keyed (BPSK) encoded, 4095-length
M-sequence.

Figure 4.7.1: Setup of SPACE08 experiment

Before processing, the carrier was removed, the data was low-pass filtered, and
the data was down-sampled to two samples per symbol. Time alignment of the signal
was achieved through the use of an M-sequence timing signal at the beginning of the
packet.

The receiver was a 24-element vertical array with 5 cm element spacing, placed
200 meters to the southwest of the transmitter.

Every two hours, all of the acoustic signals being tested for the experiment were
sent sequentially. This entire collection of signals made up one time epoch. Each
epoch is labelled by the Julian date and time of the start of transmission of its
first packet.

4.7.2  Comparing  methods  for  computing  beam  weights 

Three  epochs  were  chosen  from  different  days  with  a  variety  of  sea  surface  conditions: 
day  297  at  time  1800,  day  294  at  time  1200,  and  day  300  at  time  0800.  These  three 
epochs  range  from  calm  on  day  297  to  high  stormy  seas  on  day  300.  Each  of  the 
methods  described  in  Section  4.5  was  tested  for  each  one  of  these  epochs,  as  was  the 
full  sensor-space  processing  and  sensor-space  processing  using  a  number  of  sensors 
equal  to  the  number  of  beams. 

The DFE parameters were chosen as follows: the number of acausal coefficients
in each feedforward section is $L_a = 10$, the number of causal coefficients in each
feedforward section is $L_c = 40$, and the number of feedback filter coefficients is
$L_{fb} = 23$. This allowed the feedforward filter to capture most of the signal
energy and the feedback filter to cancel most of the ISI. The received signal was
sampled twice per symbol (i.e. the fractional sampling rate was 2).

An EW-RLS algorithm was used to estimate the DA-DFE equalizer taps. In order
to ensure that the observed results were not an artifact of the choice of the
exponential weighting factor, $\lambda$, seven different values were tested from
$\lambda = 0.996$ to $\lambda = 0.9995$. The minimum mean squared error results across
all $\lambda$ are shown in the figures below.

Using the ray-path model described in Section 4.3 with the experiment geometry,
the number of significant arrivals was estimated to be seven. Thus, 7 beams
were created using each of the proposed methods for determining the beam weights
(except for the time-aligned uniform beams, which only used 5). The broadside angle
is defined to be at 90° with the surface at 0° and the seafloor at 180°. The elevation
angles examined for the beamspace methods were from 75.5° to 104.5°. For the DPSS
method, the angular spread used was 70.3° to 109.7°.

For  the  hybrid  methods,  the  number  of  beams  for  the  initial  beamformer  was 
chosen  to  be  12.  This  number  was  chosen  because  it  was  higher  than  the  estimated 
number  of  arrivals  but  was  much  less  than  the  number  of  sensors. 

The mean squared error (MSE) at the output of the equalizer, $\epsilon_{\mathrm{MSE}}$,
was the metric used to compare the different methods. This MSE is

$$\epsilon_{\mathrm{MSE}} = \frac{1}{N} \sum_{n=1}^{N}
\frac{|d[n] - \hat{d}[n]|^2}{|d[n]|^2}, \qquad (4.7.1)$$

where $d[n]$ is the transmitted symbol and $\hat{d}[n]$ is the filtered data, before
the symbol decision.

Due to the large array gain, the bit error rate (BER) was nearly zero in all
examined cases, even as the SNR was degraded. The SNR was degraded by adding a
scaled version of an ambient noise signal recorded during a silent period in the
epoch.

Figure 4.7.3 shows the results from the day 297, time 1800 epoch, the day 294, time
1200 epoch, and the day 300, time 0800 epoch. The input SNR is the ratio of the
measured incoherent signal energy to noise energy before equalization. The observed
input SNR for each epoch is given in Table 4.7.1.

Table 4.7.1: Input signal to noise ratio for each of the data epochs examined from
the SPACE08 experiment.

Epoch                  Input SNR
Day 297, time 1800     41.2 dB
Day 294, time 1200     37.2 dB
Day 300, time 0800     35.2 dB

The results show that the adaptive methods and the hybrid adaptive methods
generally outperform the non-adaptive methods, and that the DPSS methods generally
outperform the other non-adaptive methods. As the sea surface conditions become
rougher, adaptive methods tend to have the lowest MSE. The non-adaptive methods
have an assumed angular spread that is violated when the sea surface becomes too
rough.

The performance of the reduced sensor adaptive method (which uses 12 sensors to
create the 7 adaptive beams) degrades relative to the non-adaptive hybrid methods as
sea surface conditions become rougher. This result implies that using hybrid methods
provides less sensitivity to surface conditions than simply ignoring sensor data.

The MCM and MVDR methods are the only methods studied which take the observed
noise correlation structure into account, so one would expect them to outperform
the other methods. The reason they do not is that methods such as MVDR and MCM were
designed with the goal of rejecting unwanted signals, whereas the goal of a
communication system is to accept as many desirable signals as possible. That the
performance of the MCM method is slightly worse than that of the uniform beams is
not surprising, since the uniform beams gather more energy from the angular range of
interest than the MCM method.

For comparison of computationally similar methods, seven of the twenty-four
sensors were used as the input to a multichannel equalizer. Several configurations
of the seven sensors were tested and the best configuration for each epoch is shown
in the figures. In all cases, the best seven sensors perform at least 2 dB worse than
either the proposed methods or the full sensor space. Thus, for the same
computational complexity, the proposed methods improve the performance dramatically
and compete quite favorably with the full sensor space results.

Figure 4.7.2: Comparison of beamforming methods using SPACE08 experimental
data. The left column ((a), (c), and (e)) contains the beamspace and adaptive
methods. The right column ((b), (d), and (f)) contains the non-adaptive methods.
(a) and (b) are results using data taken on day 297, time 1800, calm conditions;
(c) and (d) are results using data taken on day 294, time 1200, smooth, rolling
waves; (e) and (f) are results using data taken on day 300, time 0800, very stormy
conditions.

Figure 4.7.3: Results from SPACE08 comparing the beamspace, adaptive,
non-adaptive, and hybrid methods for three sea surface conditions: (a) calm [day
297, time 1800], (b) rolling waves [day 294, time 1200], and (c) stormy [day 300,
time 0800]. In all three cases, the relative performance of the beamspace processing
is reduced as the SNR is reduced. For the other methods the performance is
approximately equivalent, with the adaptive methods having the best performance as
the sea surface conditions become rougher.

Table 4.7.2: Description of beamforming method labels.

Legend Entry       Description
All Sensors        Complete sensor space using all available sensors.
Reduced Sensors    Sensor space using 7 best sensors.
Uniform            7 uniformly weighted beams, each steered to the null of its
                   neighbor.
Time-aligned       Uniformly weighted beams only equalized in region of expected
                   channel energy.
DPSS               7 discrete prolate spheroidal beams with angle span calculated
                   using ray-path model.
Eig                Beams composed of eigenvectors corresponding to largest 7
                   principal components of received signal correlation matrix.
MCM                7 multiple constraint beams steered in same directions as
                   uniform weighted beamformer.
Fully Adaptive     Adaptive beamformer with 7 beams.
ADP Reduced        Adaptive beamformer with 7 beams using only 12 sensors.
ADP DPSS           12 beam DPSS beamformer followed by 7 beam adaptive beamformer.
ADP Eig            12 beams corresponding to largest 12 principal components of
                   the received signal correlation matrix followed by 7 beam
                   adaptive beamformer.

Figure 4.7.4: Procedure for estimating the subspace dimension from data.

All results presented in this section are sensitive to the choice of the exponential
weighting factor, $\lambda$. Finding the optimal $\lambda$ for a given channel is
beyond the scope of this work. Further work on adaptively tracking the optimal
exponential weighting factors is merited.

4.7.3  Comparing  methods  for  estimating  number  of  beams 

In this subsection, the methods for computing the number of beams are compared
using experimental data. The data used is taken from the SPACE08 experiment,
from the day 297, time 1800 epoch.

To determine the number of beams, the data is windowed using a Hann window
(to reduce side-lobe levels) and then passed through a discrete Fourier transform.
The statistics and correlation matrix are then calculated for each temporal frequency
bin, and these are then used to estimate the number of arrivals. This process is
illustrated in Figure 4.7.4.
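
The windowing, transform, and per-bin statistics steps can be sketched as follows.
This is a small illustration under stated assumptions rather than the code used
for the experiment: it uses a direct DFT on short blocks, a periodic Hann window,
hypothetical function names, and synthetic three-sensor data, and it applies only
the eigenvalue ratio form of the estimator to each bin.

```python
import cmath
import math


def hann(length):
    """Periodic Hann window of the given length."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / length) for i in range(length)]


def dft(block):
    """Direct O(N^2) DFT, adequate for a short illustrative block."""
    n = len(block)
    return [sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(block))
            for k in range(n)]


def bin_snapshots(sensor_data, dft_len, overlap):
    """Return snaps[b] = list of K-element snapshot vectors for frequency bin b."""
    window = hann(dft_len)
    step = dft_len - overlap
    num_blocks = (len(sensor_data[0]) - dft_len) // step + 1
    snaps = [[] for _ in range(dft_len)]
    for m in range(num_blocks):
        start = m * step
        # Window and transform the same time block on every sensor.
        spectra = [dft([w * x for w, x in zip(window, channel[start:start + dft_len])])
                   for channel in sensor_data]
        for b in range(dft_len):
            snaps[b].append([spectrum[b] for spectrum in spectra])
    return snaps


def eig_ratio(snapshots):
    """(Tr R)^2 / Tr(R^2) for the sample correlation matrix of one bin."""
    k = len(snapshots[0])
    r = [[sum(u[a] * u[b].conjugate() for u in snapshots) / len(snapshots)
          for b in range(k)] for a in range(k)]
    trace = sum(r[i][i].real for i in range(k))
    return trace ** 2 / sum(abs(x) ** 2 for row in r for x in row)


# Hypothetical three-sensor data: one slowly fading tone centered on bin 4,
# identical on every sensor up to a fixed gain (a single arrival, no noise).
tone = [math.cos(2.0 * math.pi * 4 * t / 16) * (1.0 + 0.5 * math.sin(0.05 * t))
        for t in range(160)]
sensor_data = [[gain * x for x in tone] for gain in (1.0, 0.5, -2.0)]

snaps = bin_snapshots(sensor_data, dft_len=16, overlap=8)
single_path_estimate = eig_ratio(snaps[4])
print(len(snaps[4]), round(single_path_estimate, 3))
```

With real data, an FFT routine would replace the direct DFT, and the per-bin
estimates would be averaged over time as described in the text.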

One way to increase the effective averaging time is to average several estimates
together. In Figure 4.7.5, estimates are shown for temporal frequencies from 0 to
$f_s/2$ for four different averaging windows: 50, 100, 500, and 1000 sample windows.
The carrier frequency is shown as a thick dashed line and the signal bandwidth is
shown as black dash-dot lines. A 512 point DFT was used with an overlap of 256
samples and a total data block length of 500,000 samples. The carrier frequency is
$f_c = 12.5$ kHz and the sampling frequency is $f_s = 39.06$ kHz.

Table 4.7.3: Estimated number of arrivals across the bandwidth of the transmitted
signal (SPACE08 data). The averaging window indicates the amount of data used to
make an individual estimate. The estimates are averaged across the entire 500,000
sample data set. The window overlap amount is half of the averaging window length.

                                   Estimation Method
Data Set              Ave. Win     AIC      BIC      DoF - MM   DoF - Eig
Day 297, Time 1800    50           22.18    20.06    2.33       2.14
                      100          22.39    20.84    2.83       1.96
                      500          22.85    22.42    4.34       1.80
                      1000         22.90    22.71    4.82       1.77

The subspace estimation results show a large difference between the information
theoretic methods and the other methods. The gap between the methods is due to
channel motion, which causes the AIC and BIC methods to produce a high estimate of
the signal subspace dimension due to averaging of the moving signal.

To determine which, if any, of the studied methods effectively estimates the useful
number of beams, the MSE at the output of the multichannel DFE is compared
when different numbers of beams are used. This experiment is performed for two of
the proposed beam-weight methods: the DPSS beams and the fully adaptive beams. The
results are shown in Figure 4.7.6.

The results shown in Figure 4.7.6 show a leveling off around the number of
beams corresponding to the number of arrivals predicted by the ray-path model. The
knee of the curves occurs before this value and appears to be well estimated by the
$\chi^2$ methods. The estimates produced using the AIC and BIC methods appear to be
much higher than the data indicates are useful.

At high SNR, the DPSS method continues to improve slightly as the number
of beams is continually increased, but at low SNR both methods show a distinct
leveling off at around seven beams. The $\chi^2$ methods provide a reliable, data
driven method for computing the number of beams needed to achieve good performance
in this experiment.

An interesting feature of these plots is that the number of beams used could also
be reduced without too much loss in performance. Thus, if computational complexity
is of chief concern, and often it is, the number of beams could be reduced, especially
at high SNR, to as low as three beams with only about 1 dB loss of performance.

Figure 4.7.5: Number of arrival estimation for data gathered on day 297, time 1800
at the SPACE08 experiment. All results presented use 500,000 signal samples, with
6 samples per transmitted symbol. Four different methods are presented: AIC is the
Akaike information criterion, BIC is the Bayesian information criterion, DoF is the
generalized $\chi^2$ method using the correlation matrix, and DoFMM is the generalized
$\chi^2$ method using the method of moments. Different averaging windows are used:
(a) length 50 averaging window, (b) length 100 averaging window, (c) length 500
averaging window, and (d) length 1000 averaging window. There was an overlap
length of half the averaging window. The solid black line shows the carrier frequency
and the dash-dot lines show the transmitted signal bandwidth.

Figure 4.7.6: Comparison of mean squared error versus the number of beams for (a)
DPSS beamformer and (b) fully adaptive beamformer. The black dashed line is the
number of arrivals estimated by the ray-path model.

4.7.4  MACE10  experiment  setup 

The MACE10 experiment was performed off the coast of Martha’s Vineyard, MA on July 23,
2010. The signals were transmitted at a depth of around 80 m to a 12-element
vertical array receiver with 5 cm element spacing, which was attached to a buoy. The
transmit hydrophone was an ITC-1007, which was towed from a distance of 500 m
from the receive array to a distance of 4500 m and back. There were two tows in this
experiment. The data presented are from the second tow.

The  carrier  frequency  of  the  transmitted  signal  was  fc  =  13  kHz,  the  bandwidth 
was  B  =  4.88  kHz,  and  the  sampling  frequency  was  fs  =  39.06  kHz.  The  transmitted 
signal  was  the  first  50,000  symbols  of  a  repeated  binary  phase  shift  keyed  (BPSK) 
encoded,  2047-length  M-sequence. 
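A transmit sequence of this type can be sketched as follows. The feedback taps are an assumption (the primitive trinomial pair used by the common PRBS11 generator); the generator polynomial actually used in the experiment is not stated here, and the function name is hypothetical.

```python
def m_sequence(degree=11, taps=(11, 9)):
    """Generate one period of a binary m-sequence with a Fibonacci LFSR.

    The tap positions are assumed, not taken from the experiment.
    """
    state = [1] * degree                 # any nonzero seed works
    out = []
    for _ in range(2 ** degree - 1):
        out.append(state[-1])
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]
    return out

bits = m_sequence()                      # one period of 2047 bits
symbols = [1 - 2 * b for b in bits]      # BPSK mapping: 0 -> +1, 1 -> -1
tx = [symbols[i % len(symbols)] for i in range(50000)]  # first 50,000 symbols

def periodic_autocorr(x, lag):
    n = len(x)
    return sum(x[i] * x[(i + lag) % n] for i in range(n))
```

The two-valued periodic autocorrelation of the mapped sequence (2047 at zero lag, -1 at every other lag) is the property that makes such sequences convenient for channel probing.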

4.7.5  MACE10  experimental  results 

The MACE10 experiment is very similar to the SPACE08 experiment in terms of
the type of data transmitted and the hardware used. The key difference between
the two experiments is that the MACE10 experiment had a moving transmitter (and
a stationary receiver). Thus, there is more channel variability in the MACE10 data
compared with the SPACE08 data.

Figure 4.7.7 shows the results of the various beamforming methods on the MACE10
data set. Two beamforming methods were not tested on the MACE10 data set: the
reduced adaptive beamformer (using fewer sensors to do adaptive processing) and the
MCM beamformer. All of the data shown here used 5 beams for beamforming.

In Figure 4.7.7, the adaptive methods all show a rapid decrease in performance as
the SNR is lowered when compared with the non-adaptive beamforming methods,
as shown in Figure 4.7.8. This reduction in performance is due to a sudden instability
within the adaptive algorithm. This results in a jump in the observed mean squared
error without any immediate correction. This effect can be seen in Figure 4.7.9.

Figure 4.7.7: Comparison of beamforming methods using MACE10 experimental
data. The left column ((a) and (c)) contains the beamspace and adaptive methods.
The right column ((b) and (d)) contains the non-adaptive methods. The two
rows represent different data sets, taken two minutes apart. The first row was taken
at time 1810, and the second at time 1812. Note that the adaptive methods have a
significant relative loss of performance at low SNR.

Figure 4.7.8: Results from MACE10 comparing the beamspace, adaptive, non-adaptive,
and hybrid methods for two different data packets: (a) at time 1810 and (b)
at time 1812. Both of these results show the relative performance degradation of
the adaptive results compared with the non-adaptive results. This is due to the
instabilities in the adaptive algorithm forcing the use of longer averaging windows.

From  the  results,  there  is  an  apparent  problem  with  the  adaptive  beamforming 
methods  for  certain  values  of  the  parameters.  Further  investigation  is  needed  to  fully 
characterize  this  phenomenon  and  to  determine  mitigating  techniques. 


4.8  Conclusions 

This  chapter  investigated  beamspace  processing  for  a  multichannel  DFE.  Assuming 
white  spatial  and  temporal  observation  noise,  the  optimal  beams  for  unknown  but 
bounded  arrival  angles  were  found  to  be  the  discrete  prolate  spheroidal  sequences. 
This  set  of  beamforming  weights  was  observed  to  be  very  tolerant  to  environmental 
mismatch  since  multiple  beams  covered  the  same  angular  range. 
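A minimal sketch of how such beam weights might be computed offline is given below. It obtains the discrete prolate spheroidal sequences as the leading eigenvectors of the sinc kernel matrix; the array size, the normalized spatial half-bandwidth (which would follow from the angle bound, element spacing, and wavelength), and the function name are all hypothetical choices for illustration.

```python
import numpy as np

def dpss_beams(num_sensors, half_bandwidth, num_beams):
    """Return (beams, concentrations); each column of beams is a weight vector.

    half_bandwidth is the normalized spatial half-bandwidth W in cycles per
    sensor, assumed here rather than derived from a specific environment.
    """
    idx = np.arange(num_sensors)
    diff = idx[:, None] - idx[None, :]
    # Kernel sin(2*pi*W*k) / (pi*k), with the k = 0 entry equal to 2W.
    kernel = 2.0 * half_bandwidth * np.sinc(2.0 * half_bandwidth * diff)
    eigvals, eigvecs = np.linalg.eigh(kernel)
    order = np.argsort(eigvals)[::-1][:num_beams]
    return eigvecs[:, order], eigvals[order]

beams, concentration = dpss_beams(num_sensors=12, half_bandwidth=0.15, num_beams=5)
print(np.round(concentration, 4))
```

Each eigenvalue is the fraction of that beam's energy falling inside the assumed angular sector, so the rate at which the values fall away from one suggests how many fixed beams are worth keeping.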

A variety of other beamforming methods were proposed for comparison with the
DPSS beams. These include uniform beams, time-aligned uniform beams, MVDR
(MCM) beams, adaptive beams, and hybrid methods. Using the SPACE08 experimental
data, all of the proposed methods were shown to have similar mean squared
error performance on calm days, and the adaptive methods, previously proposed by
Stojanovic et al., outperformed all other methods when the channel was quickly varying.
The adaptive methods outperformed the non-adaptive methods in this case because
the adaptive beamformer could adaptively change the effective number of beams and
stormy conditions caused the observed angle bound to be larger than the bound used
to compute the non-adaptive beams.

Figure 4.7.9: Results from MACE10 comparing MSE for the data set from time 1812
at an SNR of 10 dB with $\lambda$ = 0.996. (a) shows the adaptive and beamspace results,
(b) shows the non-adaptive methods, and (c) compares the two. Notice that all of
the adaptive methods have a point where the algorithm becomes unstable and the
estimates are no longer valid.

The proposed method of using DPSS beam weights had lower computational complexity
than the fully adaptive approach proposed by Stojanovic because the beam
weights can be computed offline. Furthermore, the proposed non-adaptive methods
did not exhibit the same non-linear instabilities observed when using the fully adaptive
approach.

Several methods were investigated for estimating the number of arrivals from the
data including information theoretic methods (AIC and BIC), generalized $\chi^2$ methods
for finding the number of degrees of freedom, and ray-path modeling of the multipath
arrivals to calculate the number of arrivals. Derivations were shown detailing the
effectiveness of generalized $\chi^2$ methods for determining the number of beams to use.

The methods for determining the number of beams were compared using the
SPACE08 experimental data. The generalized $\chi^2$ methods were found to have the
best estimate of the appropriate number of beams to use based on the mean squared
error. The ray-path modeling methods also provided a reasonable estimate, but one
that is slightly higher than is necessary for the data studied.

Chapter 5


The  effect  of  fixing  channel  model 
order  on  equalizer  performance 

5.1  Introduction 

Underwater wireless communication hardware is designed to be used in a variety of
environments. Without knowledge of the particular underwater operating environment,
some parameters, such as number of equalizer coefficients or channel delay
spread, may need to be hardwired for system usability. This chapter examines the
errors introduced when the number of coefficients in a channel model used to create
equalizer coefficients differs from the number of channel coefficients in the true
channel. The analysis presented in this chapter is a special case of the effective noise
analysis from Chapter 3. The structure of this special case facilitates a special
compensation algorithm that improves performance but which could not be used in the
more general case. In this chapter, the channel estimation errors are due to model
order mismatch and the observation noise is ignored.

The reverberation time of the channel can range from less than ten milliseconds to
hundreds of milliseconds and can change over time. This variation can make estimating
the channel length very challenging. Underestimating the length of a channel can
wreak havoc on equalizer performance [58]. Fortunately, the noise caused by using a
different number of channel coefficients in the modeled channel than there are in the
true channel can be estimated and used to improve equalizer performance.

This  chapter  focuses  on  analyzing  changes  in  equalizer  performance  due  to  channel 
length  estimation  errors.  Studies  to  date  on  this  topic  have  been  highly  empirical 
and  the  solutions  have  been  somewhat  ad-hoc.  The  goal  of  this  chapter  is  to  provide 
an  intuitive  analysis  that  explains  shortcomings  of  previously  proposed  solutions  and 
points  to  new  solutions  for  the  LE  and  the  DFE.  The  analysis  in  this  chapter  explains 
the  increase  in  mean  squared  error  of  the  equalized  signal  due  to  the  use  of  a  different 
number  of  coefficients  in  the  modeled  channel  than  in  the  true  channel.  The  results 
do  not  include  channel  estimation  error  which  will  further  increase  the  MSE.  An 
algorithm  is  presented  to  include  unmodeled  channel  coefficients  in  the  equalizer 
coefficient  calculations  which  reduces  the  output  MSE.  Experimental  data  is  used  to 
validate  the  proposed  algorithm. 

5.2  Assumptions 

To streamline the analysis and emphasize the desired effects, five assumptions
about the transmitted data, the observation noise, and the channel are made in this
chapter:

1. The transmitted data is modeled as samples of a discrete white random process
with unit variance. For a length M transmitted data vector $\mathbf{d}[n]$, this implies
$E\{\mathbf{d}[n]\mathbf{d}^H[n]\} = \mathbf{I}_M$.

2. The observation noise and the transmitted data are uncorrelated. This implies
that for a complex, baseband noise vector $\boldsymbol{\nu}[n]$ and a transmitted data vector
$\mathbf{d}[m]$, $E\{\boldsymbol{\nu}[n]\mathbf{d}^H[m]\} = \mathbf{0}$.

3. The estimates of past data are assumed to be perfect, i.e. $\hat{\mathbf{d}} = \mathbf{d}$, removing
the error dependence from data estimation and isolating the channel length
estimation errors.

4. The channel estimate is assumed to be perfect for the specified number of channel
coefficients. Only errors from having a different number of coefficients in the
model than in the true channel are considered; other types of channel estimation
error are not considered.

5. The channel impulse response is assumed to be of finite extent (FIR).

The first three assumptions are quite common and do not lessen the generality or
usefulness of the derived results. The fourth assumption is made to focus the methods
on compensating for channel length mismatch. The last assumption is reasonable
given observed experimental data. For the remainder of this chapter, the time index is
dropped for notational simplicity.

The channel convolution matrix, $\mathbf{G}$, described in Section 2.2 is important to the
derivations in this chapter. To simplify discussion, the rows of the channel convolution
matrix are labeled as

$$\mathbf{G}^T = [\mathbf{g}_{-L_c-N_c+2} \;\cdots\; \mathbf{g}_0 \;\cdots\; \mathbf{g}_{L_a+N_a}], \tag{5.2.1}$$

where the index is a relative offset from the zero column, corresponding to the data
symbol currently being estimated (i.e. $d_0$).


5.3  Approach 

The  central  question  of  this  chapter  is  “How  does  an  incorrect  assumption  about 
the  number  of  channel  coefficients  affect  equalization?”  To  answer  this  question, 
the  equalizer  coefficients  computed  using  an  incorrect  channel  length  assumption  are 
compared  with  the  equalizer  coefficients  calculated  using  the  true  channel  length. 

As discussed in Section 2.2, convolution can be written as a matrix multiplication.
The true channel convolution matrix can be split into the sum of an estimated channel
convolution matrix and a delta offset,

$$\mathbf{G} = \hat{\mathbf{G}} + \Delta\mathbf{G}. \tag{5.3.1}$$

Recall from Section 2.4 that the linear equalizer coefficients can be found using

$$\mathbf{h}_{\mathrm{lin}}[n] = [\mathbf{G}^H[n]\mathbf{G}[n] + \mathbf{R}_\nu]^{-1}\mathbf{g}_0^*, \tag{5.3.2}$$

and the DFE equalizer coefficients can be found using

$$\mathbf{h}_{\mathrm{ff}}[n] = [\mathbf{G}_0^H[n]\mathbf{G}_0[n] + \mathbf{R}_\nu]^{-1}\mathbf{g}_0^*, \tag{5.3.3}$$

$$\mathbf{h}_{\mathrm{fb}}[n] = -\mathbf{G}_{\mathrm{fb}}[n]\mathbf{h}_{\mathrm{ff}}[n]. \tag{5.3.4}$$


Using the expression from eq. (5.3.1) with (2.4.9) and (2.4.20), the effect of channel
estimation errors is observable in both the equalizer coefficients and the mean
squared error (MSE). This formulation is not specific to channel length estimation
errors and can be generalized to other types of errors, such as wrongly guessing a
sparsity constraint. The focus in this chapter is on channel length errors since the
analysis points to a tractable solution.


Notation is used to distinguish quantities computed using the true channel
parameters from quantities computed using estimated or assumed channel
parameters. If the true channel has N coefficients, the channel impulse
response is $\mathbf{g}[n] = [g[n,0],\, g[n,1],\, \ldots,\, g[n,N-1]]^T$. Similarly, if the estimated
channel has length M, the estimated channel impulse response is

$$\hat{g}[n,i] = \begin{cases} g[n,i], & i < N \\ 0, & i \ge N, \text{ when } M > N. \end{cases} \tag{5.3.5}$$

The vector $\hat{\mathbf{g}}$ is defined as

$$\hat{\mathbf{g}}[n] = [\hat{g}[n,0] \;\; \hat{g}[n,1] \;\; \ldots \;\; \hat{g}[n,M-1]]^T, \tag{5.3.6}$$

and the estimated channel convolution matrix, $\hat{\mathbf{G}}$, is the estimated channel vectors
padded with zeros so that the result is the same when multiplying $\hat{\mathbf{G}}^H\mathbf{d}$ as when
convolving the estimated channel with the transmitted data. Now, from eq. (5.3.1),
$\Delta\mathbf{G}$ is computed as $\mathbf{G} - \hat{\mathbf{G}}$, where $\hat{\mathbf{G}}$ will have rows with all zero elements appended
appropriately so the two matrices are the same dimension.

Note that in both eq. (2.4.9) and eq. (2.4.20), the channel convolution matrix appears
as a product with itself and with multiplication by a selection matrix. Padding
this matrix with zero columns and appropriately modifying the selection vector so it
acts on the same column of $\mathbf{G}$ does not change the result.

In the case of overestimation of the channel length, $\Delta\mathbf{G} = \mathbf{0}$ so $\hat{\mathbf{G}} = \mathbf{G}$. Thus,
under the described conditions, overestimating the channel length does not increase
the MSE of the equalizer. This assumes that channel impulse response coefficients
are perfectly known for the assumed channel length. In practice the expectation is
estimated through time averaging, which increases the MSE [66], but these effects are
not taken into account in this chapter. Underestimating the channel length increases the
MSE (even with perfect channel knowledge) due to the unmodelled ISI. The next
sections explore the effects of assuming too short a channel length.
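The asymmetry between over- and underestimating the channel length can be checked numerically with a short sketch. It assumes a real-valued three-tap channel, white unit-variance data, white noise, and perfectly known coefficients for whatever length is modeled; all numerical values and helper names are hypothetical. The matrix `A` below plays the role of $\mathbf{G}^H$, so that the received vector is `A @ d` plus noise.

```python
import numpy as np

def pad(g, length):
    """Zero-pad a channel vector so all models share one dimension."""
    g = np.asarray(g, dtype=float)
    return np.concatenate([g, np.zeros(length - len(g))])

def conv_matrix(g, num_taps):
    """Matrix A with A @ d equal to np.convolve(d, g, mode='valid')."""
    L = len(g)
    A = np.zeros((num_taps, num_taps + L - 1))
    for k in range(num_taps):
        A[k, k:k + L] = g[::-1]
    return A

def le_mse(A_true, A_model, noise_var, col):
    """MSE of a linear equalizer designed from A_model, operating on A_true."""
    n = A_true.shape[0]
    R_model = A_model @ A_model.T + noise_var * np.eye(n)
    h = np.linalg.solve(R_model, A_model[:, col])
    R_true = A_true @ A_true.T + noise_var * np.eye(n)
    return 1.0 - 2.0 * h @ A_true[:, col] + h @ R_true @ h

g_true = np.array([1.0, 0.5, 0.3])       # true channel, N = 3 taps
max_len, num_taps, noise_var, col = 5, 6, 0.01, 5
A_true = conv_matrix(pad(g_true, max_len), num_taps)

mse = {}
for M in range(1, max_len + 1):          # assumed channel length
    A_model = conv_matrix(pad(g_true[:M], max_len), num_taps)
    mse[M] = le_mse(A_true, A_model, noise_var, col)
    print(M, round(mse[M], 5))
```

For assumed lengths of three or more the model matrix equals the true one and the MSE stays at its minimum, while shorter models leave unmodelled ISI and a larger MSE.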


5.3.1  LE  analysis 

Change  in  equalizer  coefficients 

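The analysis of an LE begins by rewriting the optimal equalizer coefficients as
an estimate plus an offset,

$$\mathbf{h}_{\mathrm{lin}} = \hat{\mathbf{h}}_{\mathrm{lin}} + \delta\mathbf{h}_{\mathrm{lin}}, \tag{5.3.7}$$

where $\hat{\mathbf{h}}_{\mathrm{lin}}$ are the coefficients derived from using the estimated channel, $\hat{\mathbf{G}}$. The
expression for $\hat{\mathbf{h}}_{\mathrm{lin}}$ is

$$\hat{\mathbf{h}}_{\mathrm{lin}} = [\hat{\mathbf{G}}^H\hat{\mathbf{G}} + \mathbf{R}_\nu]^{-1}\hat{\mathbf{g}}_0^*. \tag{5.3.8}$$

Using eq. (5.3.1), this equation can be rewritten as

$$\begin{aligned}
\hat{\mathbf{h}}_{\mathrm{lin}} &= [(\mathbf{G} - \Delta\mathbf{G})^H(\mathbf{G} - \Delta\mathbf{G}) + \mathbf{R}_\nu]^{-1}\hat{\mathbf{g}}_0^* \\
&= [\mathbf{G}^H\mathbf{G} - \mathbf{G}^H\Delta\mathbf{G} - \Delta\mathbf{G}^H\mathbf{G} + \Delta\mathbf{G}^H\Delta\mathbf{G} + \mathbf{R}_\nu]^{-1}\hat{\mathbf{g}}_0^*.
\end{aligned} \tag{5.3.9}$$

To simplify the derivations, the following quantities are defined

$$\mathbf{Q} = \mathbf{G}^H\mathbf{G} + \mathbf{R}_\nu \tag{5.3.10}$$

$$\mathbf{W} = \mathbf{G}^H\Delta\mathbf{G} + \Delta\mathbf{G}^H\mathbf{G} - \Delta\mathbf{G}^H\Delta\mathbf{G} \tag{5.3.11}$$

$$\hat{\mathbf{g}}_0 = \mathbf{g}_0 - \delta\mathbf{g}_0. \tag{5.3.12}$$

Note that both $\mathbf{Q}$ and $\mathbf{W}$ are Hermitian. Using these substitutions, the equation
for the coefficients becomes a function of the difference between the terms from the
true channel, $\mathbf{Q}$, and the error terms, $\mathbf{W}$. The definition of $\mathbf{W}$ includes cross terms
between the true channel and the channel estimation error that have been missing
from previous analysis in the literature.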


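Eq. (5.3.9) can be rewritten as

$$\begin{aligned}
\hat{\mathbf{h}}_{\mathrm{lin}} &= [\mathbf{Q} - \mathbf{W}]^{-1}(\mathbf{g}_0 - \delta\mathbf{g}_0)^* \\
&= \mathbf{Q}^{-1}\mathbf{g}_0^* - \mathbf{Q}^{-1}\delta\mathbf{g}_0^* + \mathbf{Q}^{-1}[\mathbf{I} - \mathbf{W}\mathbf{Q}^{-1}]^{-1}\mathbf{W}\mathbf{Q}^{-1}(\mathbf{g}_0 - \delta\mathbf{g}_0)^* \\
&= \mathbf{h}_{\mathrm{lin}} - \mathbf{Q}^{-1}\delta\mathbf{g}_0^* + \mathbf{Q}^{-1}[\mathbf{I} - \mathbf{W}\mathbf{Q}^{-1}]^{-1}\mathbf{W}\mathbf{Q}^{-1}(\mathbf{g}_0 - \delta\mathbf{g}_0)^*.
\end{aligned} \tag{5.3.13}$$

The second equality comes from applying the Woodbury identity. The third equality
comes from the expression for the LE coefficients in eq. (2.4.9). Rearranging terms
and using eq. (5.3.7), the form of the perturbation of the equalizer coefficients is
found to be

$$\delta\mathbf{h}_{\mathrm{lin}} = \mathbf{Q}^{-1}\delta\mathbf{g}_0^* - \mathbf{Q}^{-1}[\mathbf{I} - \mathbf{W}\mathbf{Q}^{-1}]^{-1}\mathbf{W}\mathbf{Q}^{-1}\hat{\mathbf{g}}_0^*. \tag{5.3.14}$$

The term $\delta\mathbf{g}_0$ is a result of using a longer equalizer than channel estimate. If the
equalizer is the same length as or shorter than the estimated channel, and the only channel
estimation errors are due to length underestimation, this term is zero. A common
engineering practice is to use equalizers that are shorter than the channel estimate
delay spread. Following this practice, $\delta\mathbf{g}_0 = \mathbf{0}$ for the remainder of this chapter. The
final form of the change in the equalizer coefficients is

$$\begin{aligned}
\delta\mathbf{h}_{\mathrm{lin}} &= -\mathbf{Q}^{-1}[\mathbf{I} - \mathbf{W}\mathbf{Q}^{-1}]^{-1}\mathbf{W}\mathbf{Q}^{-1}\mathbf{g}_0^* \\
&= -[\mathbf{Q} - \mathbf{W}]^{-1}\mathbf{W}\mathbf{h}_{\mathrm{lin}},
\end{aligned} \tag{5.3.15}$$

i.e. the offset of the equalizer coefficients from the MMSE solution is a linear
combination of the MMSE solution (equalizer coefficients) where the weighting is a function
of the channel estimation error.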
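The algebra behind this closed form can be checked numerically. The sketch below uses generic random Hermitian matrices as stand-ins for $\mathbf{Q}$ and $\mathbf{W}$ rather than matrices built from a channel, and it fixes the sign convention explicitly by checking the identity in the form (mismatched solution) minus (MMSE solution) $= [\mathbf{Q} - \mathbf{W}]^{-1}\mathbf{W}\mathbf{h}_{\mathrm{lin}}$. The helper `crandn` and all sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

def crandn(*shape):
    """Complex Gaussian samples (hypothetical helper)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

B = crandn(n, n)
Q = B.conj().T @ B + np.eye(n)           # Hermitian, positive definite
C = 0.02 * crandn(n, n)
W = C + C.conj().T                       # small Hermitian perturbation
g0c = crandn(n)                          # stands in for the cross-correlation vector

Q_inv = np.linalg.inv(Q)
h = Q_inv @ g0c                          # solution with the true Q
h_hat = np.linalg.solve(Q - W, g0c)      # solution with the perturbed matrix

closed_form = np.linalg.solve(Q - W, W @ h)
woodbury_form = Q_inv @ np.linalg.inv(np.eye(n) - W @ Q_inv) @ W @ h

print(np.max(np.abs((h_hat - h) - closed_form)))
```

Both expressions agree with the directly computed difference to numerical precision, which confirms that the perturbation is a linear map applied to the MMSE coefficients.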

Change  in  MAE 

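This section describes how the channel length estimation errors change the minimum
achievable error (MAE) assuming knowledge of the channel impulse-response
coefficient values, if not their number. The first step is to calculate the mean squared
error using the estimated filter coefficients. The MAE term is then isolated using
eq. (2.4.12). The terms which are not included in the MAE are referred to as the
excess error.

The estimation error can be written as

$$e_{\mathrm{lin}} = \hat{\mathbf{h}}^H\mathbf{u} - d, \tag{5.3.16}$$

and then the expected mean squared error can be written as

$$E\{|e_{\mathrm{lin}}|^2\} = E\{|\hat{\mathbf{h}}^H\mathbf{u} - d|^2\} \tag{5.3.17}$$

$$= E\{\hat{\mathbf{h}}^H\mathbf{u}\mathbf{u}^H\hat{\mathbf{h}} - \hat{\mathbf{h}}^H\mathbf{u}d^* - d\,\mathbf{u}^H\hat{\mathbf{h}} + dd^*\}. \tag{5.3.18}$$

Using the assumption that $\sigma_d^2 = E\{dd^*\} = 1$ and the relations from eq. (2.2.5) and
$\hat{\mathbf{h}} = \mathbf{h}_{\mathrm{lin}} - \delta\mathbf{h}$, eq. (5.3.18) simplifies to

$$E\{|e_{\mathrm{lin}}|^2\} = 1 - \mathbf{g}_0^T[\mathbf{G}^H\mathbf{G} + \mathbf{R}_\nu]^{-1}\mathbf{g}_0^* + \delta\mathbf{h}^H[\mathbf{G}^H\mathbf{G} + \mathbf{R}_\nu]\delta\mathbf{h}. \tag{5.3.19}$$

Notice that the first two terms are the MAE, $\sigma^2_{\mathrm{lin}}$. Substituting the relation from
eq. (5.3.15) into eq. (5.3.19) produces an alternate form of the expected MSE,

$$\begin{aligned}
E\{|e_{\mathrm{lin}}|^2\} &= \sigma^2_{\mathrm{lin}} + (\mathbf{Q}^{-1}[\mathbf{I} - \mathbf{W}\mathbf{Q}^{-1}]^{-1}\mathbf{W}\mathbf{Q}^{-1}\mathbf{g}_0^*)^H\,\mathbf{Q}\,\mathbf{Q}^{-1}[\mathbf{I} - \mathbf{W}\mathbf{Q}^{-1}]^{-1}\mathbf{W}\mathbf{Q}^{-1}\mathbf{g}_0^* \\
&= \sigma^2_{\mathrm{lin}} + \mathbf{h}_{\mathrm{lin}}^H\mathbf{W}^H[\mathbf{I} - \mathbf{W}\mathbf{Q}^{-1}]^{-H}\mathbf{Q}^{-1}[\mathbf{I} - \mathbf{W}\mathbf{Q}^{-1}]^{-1}\mathbf{W}\mathbf{h}_{\mathrm{lin}}.
\end{aligned} \tag{5.3.20}$$

Note that the matrix $\mathbf{Q}$ is Hermitian and positive definite.

The matrix

$$\mathbf{A} = \mathbf{W}^H[\mathbf{I} - \mathbf{W}\mathbf{Q}^{-1}]^{-H}\mathbf{Q}^{-1}[\mathbf{I} - \mathbf{W}\mathbf{Q}^{-1}]^{-1}\mathbf{W} \tag{5.3.21}$$

is Hermitian and positive semidefinite, so the quantity $\mathbf{h}_{\mathrm{lin}}^H\mathbf{A}\mathbf{h}_{\mathrm{lin}} \ge 0$. This last term
is the excess error introduced by underestimating the channel length. This result is
similar to the result presented in [75], except here there is an apparent structure of
the excess error.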
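The decomposition of the MSE into the MAE plus a non-negative quadratic excess term can be verified with a small numerical sketch. The matrices below are random stand-ins (a random mixing matrix in place of the channel convolution matrix and a small Hermitian perturbation in place of $\mathbf{W}$); sizes, noise level, and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 7
A_ch = (rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))) / np.sqrt(2 * m)
Q = A_ch @ A_ch.conj().T + 0.5 * np.eye(n)   # data correlation plus white noise
p = A_ch[:, 0]                               # cross-correlation with the desired symbol
C = 0.01 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
W = C + C.conj().T                           # Hermitian model error term

def mse(h):
    """E|h^H u - d|^2 for unit-variance white data."""
    return 1.0 - 2.0 * np.real(np.vdot(h, p)) + np.real(np.vdot(h, Q @ h))

Q_inv = np.linalg.inv(Q)
h = Q_inv @ p                                # MMSE coefficients
h_hat = np.linalg.solve(Q - W, p)            # coefficients from the mismatched model
dh = h - h_hat

mae = mse(h)
excess_direct = mse(h_hat) - mae
excess_quadratic = np.real(np.vdot(dh, Q @ dh))
M = np.linalg.inv(np.eye(n) - W @ Q_inv)
A_mat = W.conj().T @ M.conj().T @ Q_inv @ M @ W
excess_closed = np.real(np.vdot(h, A_mat @ h))

print(mae, excess_direct, excess_quadratic, excess_closed)
```

The three excess-error values coincide, and the excess is non-negative regardless of the sign of the coefficient offset because it enters only through a quadratic form in $\mathbf{Q}$.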

Interpretation  of  results 

So far, the effect of underestimating the channel length on the linear equalizer
coefficients and the MAE has been described. This analysis has assumed the use of the
CEB equalization algorithm since there is no concept of channel length included in
the DA equalization algorithm.

In the matrix $\mathbf{W}$ from eq. (5.3.11), the term $\Delta\mathbf{G}^H\Delta\mathbf{G}$ always has a strong
diagonal component equal to the energy (2-norm) in the unmodelled coefficients. The
regularization term proposed by Lee and Cox [58] can be viewed as a very crude
approximation to this term. Preisig [75] explicitly estimates this term but does
not capture the cross terms.

The cross terms $\mathbf{G}^H\Delta\mathbf{G}$ and $\Delta\mathbf{G}^H\mathbf{G}$ that appear in the $\mathbf{W}$ matrix provide a
measure of the interaction between the missing coefficients and the estimated channel.
Specifically, if the channel is sparse and most of the coefficients are clustered together
with one outlier, the cross terms indicate how much the outlier will interact with the
channel. When there are at least $L_a + L_c$ zeros (i.e. a number of zeros equal to the
equalizer length), the cross terms are zero. In this case, the only remaining term in
$\mathbf{W}$ is $\Delta\mathbf{G}^H\Delta\mathbf{G}$, which can be folded into the noise correlation matrix, as in [75].


Compensating  for  estimation  error 

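Using the assumption that the transmitted data is white, $\Delta\mathbf{G}$ can be estimated from
the difference between the received data and its prediction from the estimated channel
convolution matrix,

$$\mathbf{e}_{\mathrm{lin}} = \mathbf{u} - \hat{\mathbf{G}}^H\mathbf{d}. \tag{5.3.22}$$

Right multiplying both sides by $\mathbf{d}'^H$, where $\mathbf{d}'$ is an estimated transmitted data vector
that is twice the length of the equalizer (assuming all estimates are correct), and
substituting for $\mathbf{u}$ using eq. (2.2.5) produces

$$\begin{aligned}
E\{\mathbf{e}_{\mathrm{lin}}\mathbf{d}'^H\} &= E\{(\mathbf{G}^H\mathbf{d} + \boldsymbol{\nu})\mathbf{d}'^H - \hat{\mathbf{G}}^H\mathbf{d}\mathbf{d}'^H\} \\
&= \mathbf{G}'^H\mathbf{I} - (\mathbf{G}' - \Delta\mathbf{G}')^H\mathbf{I} \\
&= \Delta\mathbf{G}'^H.
\end{aligned} \tag{5.3.23}$$

In these equations, $\mathbf{G}'$ is the true channel convolution matrix with a length of twice
the length of the equalizer (length of $\mathbf{d}'$) and $\Delta\mathbf{G}'$ is the offset matrix with this same
length parameter. The second equality follows from eq. (5.3.1).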
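A minimal simulation sketch of this estimate is shown below. It takes the residual as the received data minus its prediction from the shortened channel model and correlates it with the known data. Several simplifications are assumed: a real-valued channel, independent BPSK data vectors per snapshot, white noise, and a data vector spanning exactly the symbols that influence the observation window (rather than twice the equalizer length). The matrix `A` plays the role of $\mathbf{G}^H$, and all names and values are hypothetical.

```python
import numpy as np

def conv_matrix(g, num_obs):
    """Matrix A with A @ d equal to np.convolve(d, g, mode='valid')."""
    L = len(g)
    A = np.zeros((num_obs, num_obs + L - 1))
    for k in range(num_obs):
        A[k, k:k + L] = g[::-1]
    return A

rng = np.random.default_rng(0)
g_true = np.array([1.0, 0.5, 0.3, -0.2])
g_model = np.array([1.0, 0.5, 0.0, 0.0])     # last two taps unmodelled
num_obs, noise_std, num_snapshots = 4, 0.1, 20000

A_true = conv_matrix(g_true, num_obs)
A_model = conv_matrix(g_model, num_obs)
delta_A = A_true - A_model                   # quantity to be recovered

D = rng.choice([-1.0, 1.0], size=(num_snapshots, A_true.shape[1]))  # white data
noise = noise_std * rng.standard_normal((num_snapshots, num_obs))
U = D @ A_true.T + noise                     # received snapshots
E = U - D @ A_model.T                        # residual after the model prediction
delta_A_est = E.T @ D / num_snapshots        # sample estimate of E{e d^H}

print(np.max(np.abs(delta_A_est - delta_A)))
```

Because the data are white, the sample correlation converges to the offset matrix, with an error that shrinks as the number of snapshots grows.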

The expression for the MSE from Preisig’s work [75] has a form which is missing
the cross terms,

$$\begin{aligned}
E\{|\mathbf{e}_{\mathrm{lin}}|^2\} &= E\{|\hat{\mathbf{G}}^H\mathbf{d} - \mathbf{G}^H\mathbf{d} - \boldsymbol{\nu}|^2\} \\
&= E\{|(\mathbf{G} - \Delta\mathbf{G})^H\mathbf{d} - \mathbf{G}^H\mathbf{d} - \boldsymbol{\nu}|^2\} \\
&= \Delta\mathbf{G}^H\Delta\mathbf{G} + \mathbf{R}_\nu.
\end{aligned} \tag{5.3.24}$$

In this formulation, there is no way to differentiate the channel offset term, $\Delta\mathbf{G}$, from
the noise correlation matrix.

Using the estimated offset, $\Delta\mathbf{G}'$, the linear equalizer coefficients are

$$\hat{\mathbf{h}}_{\mathrm{lin}} = [E\{\mathbf{e}'_{\mathrm{lin}}\mathbf{e}'^{H}_{\mathrm{lin}}\} + \Delta\mathbf{G}'^H\Delta\mathbf{G}' + \hat{\mathbf{G}}^H\hat{\mathbf{G}} + \Delta\mathbf{G}'^H\hat{\mathbf{G}} + \hat{\mathbf{G}}^H\Delta\mathbf{G}']^{-1}\mathbf{g}_0^*, \tag{5.3.25}$$

where $\mathbf{e}'_{\mathrm{lin}} = \mathbf{e}_{\mathrm{lin}} - \Delta\mathbf{G}'^H\mathbf{d}' = \Delta\mathbf{G}''^H\mathbf{d}' + \boldsymbol{\nu}$ is the modified observed error with
$\Delta\mathbf{G}''$ defined as the part of the offset not estimated using the proposed method,
$\Delta\mathbf{G} = \Delta\mathbf{G}' + \Delta\mathbf{G}''$. The term $E\{\mathbf{e}'_{\mathrm{lin}}\mathbf{e}'^{H}_{\mathrm{lin}}\}$ represents an estimate of the effective
noise correlation matrix, introduced in Chapter 3. Evaluating this expectation (across
the data symbols and noise) and combining terms, eq. (5.3.25) becomes

$$\hat{\mathbf{h}}_{\mathrm{lin}} = [\mathbf{R}_\nu + \Delta\mathbf{G}''^H\Delta\mathbf{G}'' + \tilde{\mathbf{G}}^H\tilde{\mathbf{G}}]^{-1}\mathbf{g}_0^*, \tag{5.3.26}$$

where $\tilde{\mathbf{G}} = \hat{\mathbf{G}} + \Delta\mathbf{G}'$ is the original estimated channel plus an estimate of the next
$L_a + L_c$ coefficients, i.e. a longer channel estimate. The effective noise due to channel
length estimation errors is reduced since previously unmodeled channel coefficients
are included in the channel model of the above expression. If the system designer
believes there is still significant energy in the true channel not included in the channel
model, the above procedure can be repeated, extending the channel model further.
The estimates of $\Delta\mathbf{G}$ used in this formulation are noisy, so the reduction in the
effective noise will not be as much as shown.

The effective noise correlation matrix is calculated using the difference between
the received data and the received data estimate. Using the proposed method for
increasing the effective length of the channel model does not change this calculation,
so the cross-terms, $\Delta\mathbf{G}'^H\Delta\mathbf{G}''$ and $\Delta\mathbf{G}''^H\Delta\mathbf{G}'$, are still missing in eq. (5.3.26). The
expected MSE of the equalizer, however, is reduced by extending the channel model
since the excess MSE is proportional to the modeling error.

5.3.2  DFE  analysis 

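The analysis for the DFE closely mirrors that of the LE and will follow a similar line of
reasoning. The change in equalizer coefficients due to channel length estimation errors
is presented first, followed by MAE analysis, discussion, and methods for channel
compensation.

Changes in equalizer coefficients

A useful method for analyzing the excess error when computing the equalizer
coefficients is to write the optimal equalizer coefficients as the sum of the estimated
coefficients and an offset,

$$\mathbf{h}_{\mathrm{ff}} = \hat{\mathbf{h}}_{\mathrm{ff}} + \delta\mathbf{h}_{\mathrm{ff}} \tag{5.3.27}$$

$$\mathbf{h}_{\mathrm{fb}} = \hat{\mathbf{h}}_{\mathrm{fb}} + \delta\mathbf{h}_{\mathrm{fb}}. \tag{5.3.28}$$

The DFE feedforward filter derivation is almost identical to the LE derivation, where
$\mathbf{G}$ and $\mathbf{h}_{\mathrm{lin}}$ from the LE derivation are replaced by $\mathbf{G}_0$ and $\mathbf{h}_{\mathrm{ff}}$ in the DFE derivation.
The feedforward estimated filter coefficients have the same form as the LE coefficients
in eq. (5.3.9),

$$\hat{\mathbf{h}}_{\mathrm{ff}} = [\mathbf{G}_0^H\mathbf{G}_0 - \mathbf{G}_0^H\Delta\mathbf{G}_0 - \Delta\mathbf{G}_0^H\mathbf{G}_0 + \Delta\mathbf{G}_0^H\Delta\mathbf{G}_0 + \mathbf{R}_\nu]^{-1}\mathbf{g}_0^*. \tag{5.3.29}$$

Following this logic, the change in the DFE feedforward coefficients is similar to the
change in the LE coefficients from eq. (5.3.15). In the case of the DFE, the change is
written as

$$\delta\mathbf{h}_{\mathrm{ff}} = -\mathbf{Q}'^{-1}[\mathbf{I} - \mathbf{W}'\mathbf{Q}'^{-1}]^{-1}\mathbf{W}'\mathbf{Q}'^{-1}\mathbf{g}_0^*, \tag{5.3.30}$$

where $\mathbf{Q}' = [\mathbf{G}_0^H\mathbf{G}_0 + \mathbf{R}_\nu]$ and $\mathbf{W}' = \mathbf{G}_0^H\Delta\mathbf{G}_0 + \Delta\mathbf{G}_0^H\mathbf{G}_0 - \Delta\mathbf{G}_0^H\Delta\mathbf{G}_0$, and where
$\Delta\mathbf{G}$ has been split into the same sections as $\mathbf{G}$ was split previously. Including the
perturbation in the feedback section gives

$$\begin{aligned}
\hat{\mathbf{h}}_{\mathrm{fb}} &= -\hat{\mathbf{G}}_{\mathrm{fb}}\hat{\mathbf{h}}_{\mathrm{ff}} \\
&= -(\mathbf{G}_{\mathrm{fb}} - \Delta\mathbf{G}_{\mathrm{fb}})(\mathbf{h}_{\mathrm{ff}} - \delta\mathbf{h}_{\mathrm{ff}}) \\
&= -\mathbf{G}_{\mathrm{fb}}\mathbf{h}_{\mathrm{ff}} + \mathbf{G}_{\mathrm{fb}}\delta\mathbf{h}_{\mathrm{ff}} + \Delta\mathbf{G}_{\mathrm{fb}}(\mathbf{h}_{\mathrm{ff}} - \delta\mathbf{h}_{\mathrm{ff}}).
\end{aligned} \tag{5.3.31}$$

Subtracting $\mathbf{h}_{\mathrm{fb}} = -\mathbf{G}_{\mathrm{fb}}\mathbf{h}_{\mathrm{ff}}$ from both sides and using eq. (5.3.28), the change is

$$\delta\mathbf{h}_{\mathrm{fb}} = -\Delta\mathbf{G}_{\mathrm{fb}}\hat{\mathbf{h}}_{\mathrm{ff}} - \mathbf{G}_{\mathrm{fb}}\delta\mathbf{h}_{\mathrm{ff}}. \tag{5.3.32}$$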


Change  in  MAE 

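Starting with the estimated error,

$$e_{\mathrm{dfe}} = \hat{\mathbf{h}}_{\mathrm{ff}}^H\mathbf{u} + \hat{\mathbf{h}}_{\mathrm{fb}}^H\mathbf{d}_{\mathrm{fb}} - d, \tag{5.3.33}$$

the squared error becomes

$$E\{|e_{\mathrm{dfe}}|^2\} = E\{|\hat{\mathbf{h}}_{\mathrm{ff}}^H\mathbf{u} + \hat{\mathbf{h}}_{\mathrm{fb}}^H\mathbf{d}_{\mathrm{fb}} - d|^2\}. \tag{5.3.34}$$

Expanding and substituting into this equation as in the LE analysis, one arrives
at the solution for the MSE,

$$\begin{aligned}
E\{|e_{\mathrm{dfe}}|^2\} &= \sigma^2_{\mathrm{dfe}} + \hat{\mathbf{h}}_{\mathrm{ff}}^H(\Delta\mathbf{G}_{\mathrm{fb}}^H\Delta\mathbf{G}_{\mathrm{fb}})\hat{\mathbf{h}}_{\mathrm{ff}} + \delta\mathbf{h}_{\mathrm{ff}}^H[\mathbf{G}_0^H\mathbf{G}_0 + \mathbf{R}_\nu]\delta\mathbf{h}_{\mathrm{ff}} \\
&= \sigma^2_{\mathrm{dfe}} + \hat{\mathbf{h}}_{\mathrm{ff}}^H(\Delta\mathbf{G}_{\mathrm{fb}}^H\Delta\mathbf{G}_{\mathrm{fb}})\hat{\mathbf{h}}_{\mathrm{ff}} \\
&\quad + \mathbf{h}_{\mathrm{ff}}^H\mathbf{W}'^H[\mathbf{I} - \mathbf{W}'\mathbf{Q}'^{-1}]^{-H}\mathbf{Q}'^{-1}[\mathbf{I} - \mathbf{W}'\mathbf{Q}'^{-1}]^{-1}\mathbf{W}'\mathbf{h}_{\mathrm{ff}}.
\end{aligned} \tag{5.3.35}$$

This expression has an additional term which accounts for the error in estimating
the feedback portion of the channel. Both excess error terms are again non-negative.
The extra error term is the energy from the error in estimating the feedback filter
coefficients.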
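The two excess error contributions can be separated numerically. The following sketch assumes a real-valued four-tap channel with the last two taps unmodelled, white unit-variance data, white noise, and perfect past decisions; the matrix `A` plays the role of $\mathbf{G}^H$, its columns before `col0` correspond to already-decided symbols, and all names and values are hypothetical.

```python
import numpy as np

def conv_matrix(g, num_obs):
    """Matrix A with A @ d equal to np.convolve(d, g, mode='valid')."""
    L = len(g)
    A = np.zeros((num_obs, num_obs + L - 1))
    for k in range(num_obs):
        A[k, k:k + L] = g[::-1]
    return A

g_true = np.array([1.0, 0.6, 0.4, 0.3])
g_model = np.array([1.0, 0.6, 0.0, 0.0])     # last two taps unmodelled
num_obs, noise_var, col0 = 4, 0.01, 3

A, A_hat = conv_matrix(g_true, num_obs), conv_matrix(g_model, num_obs)
A_fb, A_0 = A[:, :col0], A[:, col0:]         # past symbols / current and future
Ah_fb, Ah_0 = A_hat[:, :col0], A_hat[:, col0:]
I = np.eye(num_obs)

Qp = A_0 @ A_0.T + noise_var * I
h_ff = np.linalg.solve(Qp, A_0[:, 0])        # feedforward filter, true channel
h_ff_hat = np.linalg.solve(Ah_0 @ Ah_0.T + noise_var * I, Ah_0[:, 0])
h_fb_hat = -Ah_fb.T @ h_ff_hat               # feedback filter from the model

# Direct MSE from the overall response to every data symbol.
resp = A.T @ h_ff_hat                        # feedforward path
resp[:col0] += h_fb_hat                      # feedback path, perfect decisions
resp[col0] -= 1.0                            # desired symbol
mse_direct = resp @ resp + noise_var * h_ff_hat @ h_ff_hat

mae = 1.0 - A_0[:, 0] @ h_ff
dA_fb = A_fb - Ah_fb
dh_ff = h_ff - h_ff_hat
fb_term = h_ff_hat @ dA_fb @ dA_fb.T @ h_ff_hat
ff_term = dh_ff @ Qp @ dh_ff
mse_decomposed = mae + fb_term + ff_term
print(mae, fb_term, ff_term, mse_direct)
```

The directly computed MSE matches the sum of the MAE, the feedback term, and the feedforward term, and the feedback term alone is strictly positive here because residual ISI from unmodelled taps is never cancelled.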

Discussion 

Much of the discussion for the LE still holds true for the DFE. One key difference is the
additional error term in the DFE excess error, $\hat{\mathbf{h}}_{\mathrm{ff}}^H(\Delta\mathbf{G}_{\mathrm{fb}}^H\Delta\mathbf{G}_{\mathrm{fb}})\hat{\mathbf{h}}_{\mathrm{ff}}$. This term shows
that estimation errors in the feedback portion of the channel show up as squared terms
in the excess error. The excess error from the feedforward section is uncorrelated with
the excess error in the feedback section. So even with perfect feedforward coefficient
estimation there will still be excess error in the DFE.

Another difference between the DFE and the LE is how the structure of $\Delta\mathbf{G}$
enters into the error equation. There is no longer a strong diagonal term, so one
would expect that regularization methods [75, 58] will not reduce the MSE in the case of
the DFE.


Compensating  for  estimation  error 

Estimating  the  DFE  offset  term  AG7  follows  a  similar  derivation  to  the  LE  case. 
This  is  because  estimating  the  error  introduced  by  the  channel  estimate  has  nothing 
to  do  with  the  equalizer  structure.  Therefore  the  procedure  for  the  DFE  is  exactly 
the  same  as  for  the  LE,  with  only  the  labels  changed, 


^dfe 

=  G  d  —  u 

(5.3.36) 

E{edfed/H} 

=  AG,h 

(5.3.37) 

Cdfe/ 

=  edfe  -  AG,ffd'. 

(5.3.38) 

Once  the  channel  estimate  is  computed,  the  channel  convolution  matrix  is  split 
into  the  feedforward  and  feedback  sections,  AG'0  and  AGft/  respectively.  Substitut¬ 
ing  the  channel  offset  into  the  equalizer  coefficient  equations,  the  feedforward  and 
feedback  portions  of  the  equalizer  equations  become 


hfr  —  +  G0  G0]  lg*0 

h  fb  =  —  [Gfb  +  AG^jhjf.  (5.3.39) 

The feedforward section is the set of feedforward coefficients that would result if a
longer channel model (by $L_a + L_c$ coefficients) had been used originally, with a new
effective noise matrix that includes the channel coefficients that are still not modeled.
The estimate of the effective noise correlation matrix,
$\mathbf{R}_{v'} = \mathbf{R}_v + \Delta\mathbf{G}''^H\Delta\mathbf{G}''$, is the same effective correlation
matrix as for the LE. This estimate will be exact if the term
$\Delta\mathbf{G}''^H\Delta\mathbf{G}' + \Delta\mathbf{G}'^H\Delta\mathbf{G}''$
is zero (i.e. there is no correlation between the channel coefficients not modeled and the
extended model). The estimated feedback coefficients include the channel length
extension. There is additional noise introduced into the estimate (not explicitly modeled
here) due to the errors in estimating $\Delta\mathbf{G}'$.


5.4  Performance  analysis  using  simulations 

This  section  provides  some  simple  examples  to  validate  these  results.  These  sim¬ 
ulations  were  executed  in  MATLAB  and  the  results  are  presented  in  the  following 
sections.  All  of  the  simulations  use  an  EW-RLS  algorithm.  This  method  is  practical 
and  approximately  equals  the  MMSE  solution  for  time-invariant  channels  [36,  86]. 

Often the noise statistics are not known, so the implementation of the correction
algorithms uses an exponential weighting algorithm to estimate the
ensemble correlation matrices. This method is preferred to a running average since
the channel may not be time-invariant, but is assumed to be slowly varying. To
illustrate this method, assume the quantity to be estimated is $E\{b\}$ and the
observations are denoted as $b[n]$. The estimate at time $n$ is

$$\hat{b}[n] = (1-\lambda)\sum_{i=0}^{n}\lambda^{n-i}\,b[i], \qquad (5.4.1)$$

where $\lambda$ is the exponential weighting factor.
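The exponentially weighted estimate can be sketched in a few lines of Python. This is a minimal illustration, assuming the estimate takes the standard exponentially weighted form with a weighting factor `lam` between 0 and 1; the function names and the recursive formulation are illustrative choices, not taken from the simulations reported here.

```python
def exp_weighted_mean(observations, lam):
    """Running estimates (1 - lam) * sum_{i<=n} lam**(n - i) * b[i], one per time step."""
    if not 0.0 <= lam < 1.0:
        raise ValueError("weighting factor must satisfy 0 <= lam < 1")
    estimate = 0.0
    history = []
    for b in observations:
        # Recursive form of the weighted sum: the old estimate decays by lam
        # and the new observation enters with weight (1 - lam).
        estimate = lam * estimate + (1.0 - lam) * b
        history.append(estimate)
    return history


def exp_weighted_mean_direct(observations, lam):
    """The same estimate at the final time step, computed from the explicit sum."""
    n = len(observations) - 1
    return (1.0 - lam) * sum(lam ** (n - i) * b for i, b in enumerate(observations))
```

The recursive form needs only the previous estimate, which is why it suits a slowly varying channel: old observations are gradually forgotten rather than kept in a fixed-length buffer.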

Only  the  DFE  is  shown  in  simulation  since  the  LE  is  a  special  case  of  the  DFE 
when  the  feedback  filter  has  zero  coefficients. 


5.4.1 Time-invariant channel

For the time-invariant channel the channel coefficients are constant for all time. The
channel time index is dropped in this subsection for brevity, so the $i$th time-invariant
channel coefficient is represented as $g[i] = g[n, i]$ for all $n$.

One example that illustrates the error term induced by using a lower channel
model order than the true channel order is a true channel with length 3 and a modeled
channel with length 2. The estimated channel convolution matrix has the form

$$\hat{\mathbf{G}} = \begin{bmatrix} g[1] & g[0] & 0 \\ 0 & g[1] & g[0] \end{bmatrix} \qquad (5.4.2)$$

and the offset matrix has the form

$$\Delta\mathbf{G} = \begin{bmatrix} g[2] & 0 & 0 & 0 \\ 0 & g[2] & 0 & 0 \end{bmatrix}. \qquad (5.4.3)$$

Since there is only one channel coefficient not included in the model, the term
$\Delta\mathbf{G}^H\Delta\mathbf{G}$ is equal to

$$\Delta\mathbf{G}^H\Delta\mathbf{G} = |g[2]|^2\,\mathbf{I}.$$


The results for a length 7, randomly chosen stationary channel are shown in
Figures 5.4.1 (LE) and 5.4.2 (DFE). Figure 5.4.1 is for a 7-coefficient linear equalizer and
Figure 5.4.2 is for a DFE with 7 feedforward coefficients and 6 feedback coefficients.
The transmitted data packets were 60,000 4-QAM symbols long and all results assume
perfect symbol estimation (no data estimation errors), which confines the observed
error to the channel length estimation errors.

The results illustrate that the proposed method for correcting the CEB equalizer
works and performs approximately as well as the DA approach. Also, while the LE does
worse than the DFE for all SNR in terms of MSE and BER, the difference between the
regularization approaches and the bias-estimated approach is smaller for the LE than for
the DFE. This is expected from the analysis because the previously proposed methods
include only $\Delta\mathbf{G}^H\Delta\mathbf{G}$ and not the cross terms.


5.4.2  Rayleigh-fading  channel 

The second case of interest is a non-stationary channel. Again the channel impulse-response
length is assumed to be underestimated by one coefficient. The analysis is
very similar to the time-invariant channel, except that this example illustrates that the
proposed methods are robust when there is channel motion.

[Figure 5.4.1 panels: (a) BER equalizer comparison on a time-invariant 7-tap channel; (b) SDE equalizer comparison on a time-invariant 7-tap channel.]

Figure 5.4.1: (a) BER and (b) MSE comparison of different LE approaches for a
simulated 7-coefficient stationary channel. The approaches include DA, CE error-estimated
(Preisig [75]), CEB bias compensated, CEB regularized (Lee and Cox [58]),
and optimal where perfect channel knowledge is assumed (no channel length estimation
error).

[Figure 5.4.2 panels: (a) BER equalizer comparison on a time-invariant 7-tap channel; (b) SDE equalizer comparison on a time-invariant 7-tap channel.]

Figure 5.4.2: (a) BER and (b) MSE comparison of different DFE approaches for
a simulated 7-coefficient stationary channel. The approaches include DA, CE error-estimated
(Preisig [75]), CEB bias compensated, CEB regularized (Lee and Cox [58]),
and optimal where perfect channel knowledge is assumed (no channel length estimation
error).

[Figure 5.4.3 panels: (a) BER equalizer comparison on a 4-tap Rayleigh fading channel; (b) SDE equalizer comparison on a 4-tap Rayleigh fading channel.]

Figure 5.4.3: (a) BER and (b) MSE comparison of different DFE approaches for
a simulated 4-coefficient Rayleigh channel. The approaches include DA, CE error-estimated
(Preisig), CEB bias compensated, CEB regularized (Lee and Cox), and
optimal where perfect channel knowledge is assumed (no channel length estimation
error).

The  simulation  was  for  a  length  4,  Rayleigh  fading  channel,  where  each  coefficient 
fades  independently.  The  coherence  time  (inverse  of  the  Doppler  spread)  of  the  chan¬ 
nel  was  one  second,  and  each  coefficient  had  equal  energy  (variance).  The  sampling 
rate  was  2400  samples  per  second  and  the  data  packet  was  60000  4-QAM  modulated 
symbols. 

The results again confirm that the proposed method outperforms other CEB
equalization methods, but the DA method outperforms them all. One unexpected result
was that the regularization method proposed by Lee et al. [58] had an increasing MSE
as the SNR increased. This model includes time-variation, so there are
errors both due to channel estimation errors and due to the channel model having
fewer coefficients than the true channel. The regularization parameter proposed by
Lee et al. does not depend on SNR, so it introduces an additional modeling error:
the difference between the MMSE regularization parameter (the inverse
SNR for white noise) and the modeled regularization parameter, which is exacerbated
as the SNR is increased.

The  DA  approach  now  clearly  outperforms  all  the  CEB  approaches,  but  the  pro¬ 
posed  bias  corrected  CEB  approach  performs  better  than  any  previously  proposed 
method.  Also,  there  is  now  a  large  gap  between  the  optimal  equalizer  which  has 
perfect  channel  knowledge  and  the  estimated  equalizers  which  estimate  the  channel 
coefficients.  This  gap  did  not  appear  in  the  time-invariant  channel  examples  and  is 
likely  due  to  channel  estimation  errors  not  related  to  underestimating  the  number  of 
channel  coefficients. 


5.5  Experimental  evidence 

5.5.1  RACE08  -  experimental  setup 

The data presented are from the Rescheduled Acoustic Communication Experiment
(RACE08), which took place in Narragansett Bay at the University of Rhode Island's
Narragansett Bay Campus from March 1 to March 17, 2008. The data presented were
transmitted using an ITC-1007 spherical transducer with a resonant frequency of
approximately 11 kHz and a bandwidth of approximately 10 kHz. The receiver was
a 12-element vertical array with 12 cm spacing located approximately 1000 m from
the transducer in a direction of 120°.

The transmitter and receivers used a sampling frequency of $f_s = 39062.5$ samples
per second. There was an anti-aliasing filter at the receivers with a cut-off of about
18.5 kHz.

The water depth was approximately constant at 10 m from transmitter to receiver.
The data presented were recorded on the last day of the experiment in the 11 AM
data cycle. The conditions were fairly calm with some small waves and light wind.

The  signals  were  BPSK  modulated,  LDPC  encoded  data  packets  with  25,000 
symbols  per  packet  transmitted  at  a  rate  of  6  samples  per  symbol  or  approximately 
6510  symbols  per  second.  A  Hamming  window  was  employed  to  reduce  side-lobe 
effects.  The  carrier  frequency  was  12  kHz  to  be  near  the  resonance  of  the  transducer. 

Before equalization, the data were low-pass filtered, transferred to baseband, and
down-sampled to two samples per symbol. Time alignment of the signal was achieved
through the use of an M-sequence timing signal at the beginning of the packet.
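The rates implied by these settings follow from simple arithmetic, sketched below. The constants are the values stated in the text; the variable names are illustrative, and the packet duration counts only the 25,000 data symbols (any timing preamble is ignored, which is an assumption of this sketch).

```python
FS_HZ = 39062.5              # sampling frequency stated in the text
SAMPLES_PER_SYMBOL = 6       # transmit oversampling stated in the text
SYMBOLS_PER_PACKET = 25000   # packet length stated in the text
EQ_SAMPLES_PER_SYMBOL = 2    # rate after down-sampling, before equalization

# Symbol rate in symbols per second (approximately 6510).
symbol_rate = FS_HZ / SAMPLES_PER_SYMBOL

# Duration of the data portion of one packet, in seconds.
packet_duration = SYMBOLS_PER_PACKET / symbol_rate

# Sample rate seen by the fractionally-spaced equalizer, in samples per second.
equalizer_rate = EQ_SAMPLES_PER_SYMBOL * symbol_rate
```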


5.5.2  Experimental  results 

The channel was modeled as having 8 coefficients (6 causal and 2 acausal coefficients)
clustered around the direct-path arrival. The equalizer was a DFE with 8 feedforward
coefficients (3 causal and 5 acausal) and 5 feedback coefficients. This structure was chosen
to capture the width of the main arrival and to cancel out nearby interference.

The three structures studied were the DA, the proposed bias-compensating CEB,
and the error-estimating CEB. The regularized model was not examined since the
error-estimating equalizer was always shown to perform better.

Figure 5.5.1 shows the estimated channel using an RLS channel estimator. This
estimated channel is much wider than the channel estimated by the CEB equalizers, so
that the structure of the channel can be seen. It appears as if most of the energy arrives
with the direct arrival, although there is an anomalous acausal arrival. There is also
some structure and motion in the causal part of the channel that is probably due to
wave motion, but these arrivals are very weak.

Figure 5.5.2 shows that the DA equalizer outperforms both CEB approaches,
but that the bias-compensated CEB equalizer does better than the error-estimating one.
This is empirical validation that the bias-compensating method may be worth
deploying on a real-world system if there are other reasons to use a CEB equalizer.

[Figure 5.5.1 plot: Channel Estimate for RACE08 Data Packet; horizontal axis: Time (s), 0.6 to 2.2.]

Figure 5.5.1: Channel estimate for the observed data packet. Fairly calm conditions
with little channel spread.

5.6  Discussion 

This chapter presented a clear look at the effect of using incorrect estimates of the
channel length on equalization. Inaccurate channel length information will only affect
CEB equalization since the DA equalizer algorithm has no notion of channel state
information. Analysis was presented to illustrate that only under-estimating the
channel length affects the MSE of the output of the equalizer.

Simulation and experimental data confirmed that an under-estimation of the channel
length does negatively affect performance and that, in this case, the DA algorithm
outperforms the CEB. A method for recovering the missing channel information through
post-processing of the information was proposed and analyzed. Even when including
this additional information, the DA equalizer still has the lowest MSE because of
channel estimation error in the CEB equalizer.

[Figure 5.5.2 plot: SNR vs. SDE equalizer comparison from RACE08 data.]

Figure 5.5.2: (a) BER and (b) MSE comparison of different DFE equalizer approaches.
The three approaches included are DA, CEB bias compensated, and the CEB error
estimated proposed by Preisig [75]. The approach proposed by Lee et al. [58] is not
included because the previous results showed that it was not an optimal approach,
i.e. the average error increased with SNR.

Chapter 6

Comparing techniques for computing equalizer coefficients

6.1  Introduction 

Decision  feedback  equalization  has  been  used  for  many  years  to  improve  bit-rate 
and  reliability  of  underwater  communication  [105].  Coefficients  for  an  equalizer  can 
be  estimated  either  directly  from  the  received  data  (Direct  Adaptation)  or  from  a 
channel  estimate  (Channel  Estimate  Based).  The  CEB-DFE  has  better  performance 
(i.e.  lower  residual  estimation  error)  while  the  DA-DFE  has  lower  computational 
complexity. 

The CEB equalizer outperforms the DA equalizer when comparing MSE after
equalization [91, 131]. The question remains, “Why is there a performance difference
between equalizers built using the DA method and those built using the CEB
method?” This chapter examines the reasons for the performance difference between
these two equalizer coefficient methods and compares the methods when used in an
underwater acoustic communication system.

A  key  conceptual  difference  between  the  DA-DFE  and  CEB-DFE  is  the  coefficients 
being  tracked,  i.e.  the  coefficients  that  are  adaptively  estimated.  For  the  CEB-DFE, 
the  channel  coefficients  are  estimated  from  the  received  data;  for  the  DA-DFE,  the 
equalizer coefficients are being estimated from the received data.

The CEB-DFE tracks the physical channel, a parameter that is much easier to
conceptualize.  A  particular  shape  of  the  channel  impulse  response  implies  something 
about  the  physical  world  that  can  be  observed  through  other  means.  For  instance,  if 
there  is  a  strong  channel  path  at  a  certain  delay,  then  there  must  be  some  reflector, 
such  as  a  boat  or  a  fish,  that  one  could  go  find.  The  dynamics  of  the  channel  are  thus 
related  to  a  physical  model  of  the  world. 

The  DA-DFE  tracks  the  equalizer  coefficients,  a  more  abstract  quantity.  The 
coefficients  satisfy  an  optimality  criterion,  but  are  not  parameters  that  necessarily 
relate  directly  to  the  physical  world.  The  dynamics  of  the  equalizer  coefficients  and 
thus  of  the  DA-DFE  are  much  more  convoluted  and  are  only  related  to  the  channel 
coefficients  through  a  non-linear  function. 

Two  theories  for  the  performance  difference  between  the  DA-DFE  and  the  CEB- 
DFE  have  evolved  in  the  literature: 

1.  The  received  data  correlation  matrix  is  ill-conditioned  and  thus  the  DA  algo¬ 
rithm  is  limited  by  numerical  error.  [131] 

2.  The  channel  estimate  based  equalizer  requires  fewer  samples  to  fully  characterize 
the  channel.  [91] 

This chapter shows that neither of these two hypotheses is entirely correct. Instead,
the evidence indicates that the performance difference is related to the coherence
time of the equalizer coefficients and the channel impulse response coefficients:
the channel coefficients have a longer coherence time than the equalizer coefficients,
so the channel estimate on which the CEB method relies can be averaged over more
samples than can the DA equalizer coefficients. Thus the estimation noise is higher in
the DA method than in the CEB method. This result is verified using simulations of
time-varying channels.

In the next section of this chapter, two models of channel coefficient correlation are
described: the Markov model and the Gaussian model. In Section 6.3 the structure
of the channel convolution matrix is briefly described, including a description of what
each column and row position signifies. Section 6.4 describes the explicit dependence
of the equalizer coefficients on the channel coefficients and how changes in channel
coefficients propagate through the equalizer coefficient formulation. Section 6.5
presents a linear perturbation model to simplify the nonlinear relationship between
the equalizer and channel coefficients. Sections 6.6 and 6.7 describe the correlation
of the equalizer coefficients and show the correlation empirically through simulation
results. Section 6.8 is a discussion of the chapter results.


6.2  Channel  coefficient  correlation  models 

An assumption that the channel impulse response is slowly varying (compared with
the averaging window) is common when estimating the channel impulse response
coefficients (or the equalizer coefficients). A reasonable question to ask at this point is
what the term "slowly varying" actually means. This section presents two channel
coefficient correlation models: Markov correlation (also known as auto-recursive
correlation), where the correlation falls off exponentially, and Gaussian correlation, which
has a bell-curve shaped correlation function.

6.2.1  Markov  correlation  model 

In a Markov channel model, also known as an auto-recursive channel model, all
statistical information from past channel coefficients is contained in the current coefficient.
This model has been previously shown to be a useful description of the underwater
channel [30]. Under the Markov model, the $i$th channel coefficient is modeled using
the relation

$$g[n+1,i] = \alpha\,g[n,i] + v[n,i], \qquad (6.2.1)$$

where $\alpha$ is a complex scaling parameter (also known as the auto-recursive parameter),
$v[n,i]$ is zero-mean, Gaussian white noise with variance $\sigma_{v,i}^2$, and $g[n,i]$ is the $i$th
channel coefficient at time $n$. The scaling parameter is bounded such that $|\alpha| < 1$ so
the channel coefficient is bounded, i.e. has finite energy. For simplicity, the remainder
of this section assumes $\alpha$ is real and positive, but the expressions can be modified
to handle a complex $\alpha$ in a straightforward manner. Often the channel is modeled as
slowly varying in time, which implies $\alpha \approx 1$.

Figure 6.2.1: Correlation function for Markov channel model with $\alpha = 0.99$ and
$\sigma_{v,i}^2 = 0.0199$.

The correlation function, $R_{g,i}[m]$, of the $i$th channel coefficient when using a
Markov model is

$$R_{g,i}[m] = E\{g[n,i]\,g^*[n+m,i]\} = \sigma_{v,i}^2\left(\frac{\alpha^{|m|}}{1-\alpha^2}\right). \qquad (6.2.2)$$

Figure 6.2.1 shows the correlation function with $\alpha = 0.99$ and $\sigma_{v,i}^2 = 0.0199$.

The quantity $N_{win}$ is defined as twice the number of time-steps before the correlation
function is scaled by $e^{-1}$, i.e. twice the number of time steps $m$ until
$|R_{g,i}[m]| = e^{-1}R_{g,i}[0]$. The scaling parameter, $\alpha$, can be expressed in terms of
$N_{win}$ through the relation

$$\alpha = e^{-2/N_{win}}. \qquad (6.2.3)$$

The $i$th channel coefficient energy or variance equals the correlation function at
zero lag, i.e. $\sigma_{g,i}^2 = R_{g,i}[0]$, so the process noise variance of the $i$th channel
coefficient is

$$\sigma_{v,i}^2 = \sigma_{g,i}^2(1-\alpha^2). \qquad (6.2.4)$$

With a specified correlation window length and channel coefficient energy, all
parameters of the channel correlation model are determined. Sometimes the fade-rate,
$f_r$, and sampling frequency, $f_s$, are given rather than the correlation window
length. In this case, the parameters can be determined using the relation

$$N_{win} = f_s/(2f_r). \qquad (6.2.5)$$

The fade-rate is defined as the width of the power spectral density (PSD) of the fading
process. The fade-rate is also referred to as the Doppler spread of the channel. The
reciprocal of the fade-rate is the coherence-time of the channel. In this thesis, the
coherence-time is defined as the lag at which the correlation function is scaled by $e^{-1}$,
i.e. the lag $m$ at which $R_{g,i}[m] = e^{-1}R_{g,i}[0]$ (same as [91]).
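The relations among the Markov coefficient, the correlation window, and the process noise variance can be checked numerically. The sketch below assumes a real, positive Markov coefficient as in this section; the function names are hypothetical and chosen only for illustration.

```python
import math


def alpha_from_window(n_win):
    """Markov coefficient for a correlation window of n_win time steps (eq. 6.2.3)."""
    return math.exp(-2.0 / n_win)


def window_from_alpha(alpha):
    """Correlation window implied by a Markov coefficient (inverse of eq. 6.2.3)."""
    return -2.0 / math.log(alpha)


def process_noise_variance(sigma_g2, alpha):
    """Process noise variance that keeps the coefficient energy at sigma_g2 (eq. 6.2.4)."""
    return sigma_g2 * (1.0 - alpha ** 2)


def markov_correlation(m, sigma_g2, alpha):
    """Correlation at lag m for a real, positive Markov coefficient (eq. 6.2.2)."""
    sigma_v2 = process_noise_variance(sigma_g2, alpha)
    return sigma_v2 * alpha ** abs(m) / (1.0 - alpha ** 2)
```

For the values used in Figure 6.2.1, `process_noise_variance(1.0, 0.99)` returns 0.0199 to within floating-point rounding, and the correlation evaluated at half the window length is the zero-lag value scaled by $e^{-1}$.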


6.2.2  Gaussian  correlation  model 

The shape of the Gaussian model correlation function is more rounded and less peaked
than that of the Markov model. The Gaussian correlation model has a correlation function
that is a Gaussian function,

$$R_{g,i}[m] = E\{g[n,i]\,g^*[n+m,i]\} = \sigma_{g,i}^2\,e^{-m^2/(2\beta^2)}, \qquad (6.2.6)$$

where

$$\beta = \frac{f_s}{2\pi f_r}. \qquad (6.2.7)$$

The correlation function with $\sigma_{g,i}^2 = 1$ and $\beta = \sqrt{2}$ is shown in Figure 6.2.2. Note
that this function has lower tails than the Markov model.

When simulating a channel with a Gaussian correlation function, zero-mean,
unit-variance, Gaussian white noise is convolved with a Gaussian function, $h_{g,i}[n]$. The
function coefficients are defined as

$$h_{g,i}[n] = \frac{\sigma_{g,i}}{\sqrt{\beta\sqrt{\pi/2}}}\,e^{-n^2/\beta^2}. \qquad (6.2.8)$$


There are several notable features about the Gaussian correlation function. The
first is that the Fourier transform of a Gaussian function is also a Gaussian function,
so the PSD of the channel impulse response coefficients will also have a Gaussian
shape. The lag $m$ at which the correlation function is reduced by a scale factor of
$e^{-1}$, i.e. $|R_{g,i}[m]| = e^{-1}R_{g,i}[0]$, is $m = \sqrt{2}\,\beta$ (see eq. (6.2.6)).

Figure 6.2.2: Correlation function for Gaussian channel model with $\sigma_{g,i}^2 = 1$ and
$\beta = \sqrt{2}$.
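The link between a Gaussian shaping filter and the Gaussian correlation function can be checked numerically. The sketch below is written under stated assumptions: the filter taps are taken proportional to $e^{-n^2/\beta^2}$, a shape chosen here so that the filter autocorrelation has the width required by eq. (6.2.6), and the taps are normalized numerically so that the zero-lag value equals the coefficient energy. The function names, the truncation length, and this normalization are illustrative and may differ from the implementation used for the reported simulations.

```python
import math


def gaussian_correlation(m, sigma_g2, beta):
    """Target correlation at lag m: sigma_g2 * exp(-m^2 / (2 beta^2))."""
    return sigma_g2 * math.exp(-(m * m) / (2.0 * beta * beta))


def shaping_filter(sigma_g2, beta, half_width):
    """Taps proportional to exp(-n^2 / beta^2) for n = -half_width..half_width,
    scaled so the sum of squared taps equals sigma_g2."""
    raw = [math.exp(-(n * n) / (beta * beta))
           for n in range(-half_width, half_width + 1)]
    energy = sum(t * t for t in raw)
    scale = math.sqrt(sigma_g2 / energy)
    return [scale * t for t in raw]


def filter_autocorrelation(taps, m):
    """Correlation at lag m of unit-variance white noise filtered by taps."""
    m = abs(m)
    return sum(taps[k] * taps[k + m] for k in range(len(taps) - m))
```

Because the input noise is white with unit variance, the output correlation is the deterministic autocorrelation of the taps, so no random draws are needed to compare the two functions.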


6.3  Structure  of  the  channel  convolution  matrix 

Recall that the received signal is modeled as

$$\mathbf{u}[n] = \mathbf{G}^H[n]\,\mathbf{d}'[n] + \mathbf{v}[n], \qquad (6.3.1)$$

where $\mathbf{u}[n]$ is a vector of received data, $\mathbf{d}'[n]$ is a vector of transmitted data, $\mathbf{v}[n]$ is a
noise vector, and $\mathbf{G}[n]$ is the channel convolution matrix (described in Section 2.2).
If the channel is time invariant and the received data is sampled once per symbol,
the channel convolution matrix is a constant Toeplitz matrix. If the signal is sampled
more than once per symbol (known as fractionally-spaced sampling), the matrix is
no longer strictly Toeplitz, but still has a Toeplitz-like structure. Symbol-spaced
sampling is assumed for the remainder of the chapter for notational simplicity,
but the derived results also apply when fractionally-spaced sampling is used.

Each position of the channel convolution matrix corresponds to a different time
and delay of the time-varying channel. Recall from Section 2.2 that the (complete)
channel convolution matrix can be separated into a feedback portion, $\mathbf{G}_{fb}$, and a
remainder portion, $\mathbf{G}_0$, referred to as the reduced channel convolution matrix. The
received signal sample vector in eq. (2.2.5) can be rewritten as

$$\mathbf{u}[n] = \begin{bmatrix}\mathbf{G}_{fb}^H[n] & \mathbf{G}_0^H[n]\end{bmatrix}\begin{bmatrix}\mathbf{d}_{fb}[n] \\ \mathbf{d}_0[n]\end{bmatrix} + \mathbf{v}[n]. \qquad (6.3.2)$$

This equation implies that the column position corresponds to the (zero-padded)
channel realization and the row position indicates which input data symbol the channel
coefficient multiplies.


6.4  Models  of  time-varying  equalizer  coefficients 

Equalizer coefficients are functions of the channel coefficients. Therefore, when the
channel coefficients vary with time, the equalizer coefficients are also time-varying.
The relation between the channel coefficients and the equalizer coefficients is highly
non-linear, so how the equalizer coefficients are affected by a channel coefficient
perturbation is not clear. In this section, three equalizer coefficient models are proposed
which explicitly include channel perturbations.

6.4.1  General  channel  variation  model 

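To determine the response of the equalizer coefficients to a perturbation in the channel,
a reasonable first step is to apply a perturbation, $\Delta\mathbf{G}$, to the channel convolution
matrix and analyze the results. The perturbation is the change in the channel
convolution matrix from one time step to the next,

$$\mathbf{G}[n+1] = \mathbf{G}[n] + \Delta\mathbf{G}[n]. \qquad (6.4.1)$$

Substituting this relation into eq. (2.4.20), the expression for the DFE coefficients
becomes

$$\begin{aligned}
\mathbf{h}_{ff}[n+1] &= (\mathbf{G}_0^H[n+1]\mathbf{G}_0[n+1] + \mathbf{R}_v)^{-1}\,\mathbf{g}_0^*[n+1] \\
&= (\mathbf{G}_0^H[n]\mathbf{G}_0[n] + \mathbf{R}_v + \Delta\mathbf{G}^H[n]\mathbf{G}[n] + \mathbf{G}^H[n]\Delta\mathbf{G}[n] + \Delta\mathbf{G}^H[n]\Delta\mathbf{G}[n])^{-1} \\
&\quad \times (\mathbf{g}_0^*[n] + \Delta\mathbf{g}^*[n])
\end{aligned} \qquad (6.4.2)$$

$$\begin{aligned}
\mathbf{h}_{fb}[n+1] &= -\mathbf{G}_{fb}[n+1]\,\mathbf{h}_{ff}[n+1] \\
&= -(\mathbf{G}_{fb}[n] + \Delta\mathbf{G}_{fb}[n])\,\mathbf{h}_{ff}[n+1],
\end{aligned} \qquad (6.4.3)$$

where $\Delta\mathbf{g}[n]$ is the row of $\Delta\mathbf{G}[n]$ in the same row position as $\mathbf{g}_0$ is in $\mathbf{G}$, and $\Delta\mathbf{G}_{fb}$ is
a matrix made up of the same row positions as $\mathbf{G}_{fb}$. Using the substitutions

$$\mathbf{Q}[n] = \mathbf{G}_0^H[n]\mathbf{G}_0[n] + \mathbf{R}_v$$

$$\mathbf{W}[n] = \Delta\mathbf{G}^H[n]\mathbf{G}[n] + \mathbf{G}^H[n]\Delta\mathbf{G}[n] + \Delta\mathbf{G}^H[n]\Delta\mathbf{G}[n],$$

equation (6.4.2) can be rewritten using the matrix inversion lemma [44] as

$$\mathbf{h}_{ff}[n+1] = (\mathbf{I} + \mathbf{Q}^{-1}[n]\mathbf{W}[n])^{-1}(\mathbf{h}_{ff}[n] + \mathbf{Q}^{-1}[n]\Delta\mathbf{g}^*[n]). \qquad (6.4.4)$$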
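The equivalence between recomputing the feedforward coefficients from the perturbed matrices and applying the perturbation form can be checked on a small numerical example. The sketch below uses real-valued 2 x 2 matrices so that conjugates can be ignored; the matrices `q` and `w` and the vectors `g` and `dg` stand in for $\mathbf{Q}[n]$, $\mathbf{W}[n]$, $\mathbf{g}_0^*[n]$ and $\Delta\mathbf{g}^*[n]$, and all helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(a, x):
    """Product of a matrix and a vector."""
    return [sum(a[i][k] * x[k] for k in range(len(x))) for i in range(len(a))]


def add(a, b):
    """Element-wise sum of two matrices of equal size."""
    return [[a[i][j] + b[i][j] for j in range(len(a[0]))] for i in range(len(a))]


def inv2(a):
    """Inverse of a 2 x 2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def direct_update(q, w, g, dg):
    """(Q + W)^-1 (g + dg): coefficients recomputed from the perturbed quantities."""
    return matvec(inv2(add(q, w)), [g[0] + dg[0], g[1] + dg[1]])


def perturbation_update(q, w, g, dg):
    """(I + Q^-1 W)^-1 (h + Q^-1 dg): update expressed through the old coefficients h."""
    q_inv = inv2(q)
    h_old = matvec(q_inv, g)
    identity = [[1.0, 0.0], [0.0, 1.0]]
    lead = inv2(add(identity, matmul(q_inv, w)))
    step = matvec(q_inv, dg)
    return matvec(lead, [h_old[0] + step[0], h_old[1] + step[1]])
```

Both functions return the same vector for any invertible `q` and `q + w`, which follows from factoring $\mathbf{Q} + \mathbf{W} = \mathbf{Q}(\mathbf{I} + \mathbf{Q}^{-1}\mathbf{W})$.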

This  type  of  analysis  was  used  in  Chapter  5  to  examine  the  effect  of  channel  model 
order  mismatch  where  the  AG  included  un-modeled  channel  coefficients.  In  the 
present  case  AG  is  due  to  channel  motion  induced  estimation  errors. 

The form of the feedforward coefficients implies that there is a nonlinear
relationship between the channel perturbation and the equalizer perturbation. In the
low SNR regime when the observation noise is white, the term $(\mathbf{I} + \mathbf{Q}^{-1}[n]\mathbf{W}[n])$ is
nearly diagonal (diagonally dominant) since the variance of the noise is much larger
than the channel coefficients and the channel coefficient perturbation, so the equalizer
perturbation is approximately linear [35].

This model of perturbation is not properly constrained because any position of the
channel convolution matrix can be perturbed, whereas in normal operation only the first
column of the channel convolution matrix is perturbed. The next section presents a
model without this inaccuracy.


6.4.2  Channel  convolution  matrix  update  model 

In the model from the previous section, all elements of the channel convolution matrix
could change from one time step to the next (the entire channel convolution matrix
was perturbed). When updating the equalizer coefficients every symbol, the majority
of matrix elements are just shifted to new positions; only the first column contains
values not previously in the matrix. The update model presented in this section
examines the dynamics of the equalizer coefficients using this constrained update
of the channel convolution matrix.

The channel is assumed to be varying according to a Markov model,

$$\mathbf{g}[n+1] = \alpha\,\mathbf{g}[n] + \mathbf{v}[n], \qquad (6.4.5)$$

where $\alpha$ is the Markov coefficient (the same for all elements of the channel vector) and
$\mathbf{v}[n]$ is the process noise vector. Channel coefficients have statistically
correlated variation when the corresponding elements of the process noise vector,
$\mathbf{v}[n]$, are statistically correlated.

To simplify notation, two new matrices are defined: the “shift” matrix, $\mathbf{S}_M$, and
the “choice” matrix, $\mathbf{C}_M$, where the subscript $M$ is the matrix dimension. The shift
matrix, $\mathbf{S}_M$, is an $M \times M$ square matrix which has ones along the diagonal directly
above the main diagonal and zeros everywhere else,

$$\mathbf{S}_M = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix}_{M \times M}. \qquad (6.4.6)$$

Multiplying by this matrix shifts the existing data to new positions from one time
step to the next.


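The choice matrix serves two functions: first, it selects the leftmost column from
the channel convolution matrix and, second, it scales that column by the Markov
coefficient, $\alpha$. $\mathbf{C}_M$ is a square $M \times M$ matrix that has $\alpha$ as the top-left element.
All remaining elements are zero,

$$\mathbf{C}_M = \begin{bmatrix} \alpha & 0 & 0 & \cdots & 0 \\ 0 & 0 & 0 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix}_{M \times M}. \qquad (6.4.7)$$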


If $N$ is the total number of channel coefficients and $L_{ff}$ is the number of feedforward
equalizer coefficients per feedforward section, then the channel convolution matrix has
dimension $(L_{ff}+N-1) \times L_{ff}$. The update equation for the channel convolution matrix
is

$$\mathbf{G}[n+1] = \mathbf{S}^T_{L_{ff}+N-1}\,\mathbf{G}[n]\,\mathbf{S}_{L_{ff}} + \mathbf{G}[n]\,\mathbf{C}_{L_{ff}} + \boldsymbol{\Upsilon}[n]. \qquad (6.4.8)$$

The matrix $\boldsymbol{\Upsilon}[n]$ has dimensions $(L_{ff}+N-1) \times L_{ff}$ and contains the zero-padded
realization of the process noise at time $n$ in the first column; all other elements
are zero. The noise realizations from previous time-steps are already included in the
convolution matrix, $\mathbf{G}[n]$, so past noise realizations are not included in $\boldsymbol{\Upsilon}[n]$.
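The constrained update can be illustrated with a small real-valued example. The sketch below rests on several assumptions that are not spelled out in the text: column $j$ of the convolution matrix holds the channel vector from $j$ time steps ago, shifted down by $j$ rows; the choice matrix carries the Markov coefficient in its (1,1) entry, which is what right-multiplication needs in order to keep the leftmost column; and all function names are hypothetical.

```python
def shift_matrix(m):
    """m x m matrix with ones on the first superdiagonal."""
    return [[1.0 if j == i + 1 else 0.0 for j in range(m)] for i in range(m)]


def choice_matrix(m, alpha):
    """m x m matrix whose only non-zero entry is alpha in the (1,1) position."""
    c = [[0.0] * m for _ in range(m)]
    c[0][0] = alpha
    return c


def transpose(a):
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(len(a[0]))] for i in range(len(a))]


def convolution_matrix(history, n_eq):
    """Build the (n_eq + N - 1) x n_eq matrix; history[j] is the channel j steps ago."""
    n_ch = len(history[0])
    g = [[0.0] * n_eq for _ in range(n_eq + n_ch - 1)]
    for j in range(n_eq):
        for k, tap in enumerate(history[j]):
            g[j + k][j] = tap
    return g


def update(g, alpha, noise):
    """Shifted old entries, plus the scaled first column, plus the new process noise."""
    rows, cols = len(g), len(g[0])
    shifted = matmul(matmul(transpose(shift_matrix(rows)), g), shift_matrix(cols))
    scaled = matmul(g, choice_matrix(cols, alpha))
    upsilon = [[0.0] * cols for _ in range(rows)]
    for k, v in enumerate(noise):
        upsilon[k][0] = v  # zero-padded process noise in the first column
    return add(add(shifted, scaled), upsilon)
```

Under these assumptions, applying `update` to the matrix built from the channel history gives the same matrix as rebuilding it from the history advanced by one Markov step, so only the first column ever receives values that were not previously in the matrix.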

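Including the variation model into the DFE formulation from eq. (2.4.20) gives

$$\begin{aligned}
\mathbf{h}_{ff}[n+1] &= (\mathbf{G}_0^H[n+1]\mathbf{G}_0[n+1] + \mathbf{R}_v)^{-1}\,\mathbf{g}_0^*[n+1] \\
&= \big((\mathbf{S}^T\mathbf{G}_0[n]\mathbf{S}_{L_{ff}} + \mathbf{G}_0[n]\mathbf{C}_{L_{ff}} + \boldsymbol{\Upsilon}_0[n])^H(\mathbf{S}^T\mathbf{G}_0[n]\mathbf{S}_{L_{ff}} + \mathbf{G}_0[n]\mathbf{C}_{L_{ff}} + \boldsymbol{\Upsilon}_0[n]) + \mathbf{R}_v\big)^{-1} \\
&\quad \times \big((\mathbf{S}^T\mathbf{G}_0[n]\mathbf{S}_{L_{ff}} + \mathbf{G}_0[n]\mathbf{C}_{L_{ff}} + \boldsymbol{\Upsilon}_0[n])^H\mathbf{s}\big) \\
&= \left(\begin{bmatrix} (\alpha\mathbf{g}[n]+\mathbf{v}[n])^H(\alpha\mathbf{g}[n]+\mathbf{v}[n]) & (\alpha\mathbf{g}[n]+\mathbf{v}[n])^H\mathbf{G}'_0[n+1] \\ \mathbf{G}'^H_0[n+1](\alpha\mathbf{g}[n]+\mathbf{v}[n]) & \mathbf{G}''^H[n]\mathbf{G}''[n] \end{bmatrix} + \mathbf{R}_v\right)^{-1} \\
&\quad \times \big((\mathbf{S}^T\mathbf{G}_0[n]\mathbf{S}_{L_{ff}} + \mathbf{G}_0[n]\mathbf{C}_{L_{ff}} + \boldsymbol{\Upsilon}_0[n])^H\mathbf{s}\big)
\end{aligned} \qquad (6.4.9)$$

$$\begin{aligned}
\mathbf{h}_{fb}[n+1] &= -\mathbf{G}_{fb}[n+1]\,\mathbf{h}_{ff}[n+1] \\
&= -(\mathbf{S}^T\mathbf{G}_{fb}[n]\mathbf{S}_{L_{ff}} + \mathbf{G}_{fb}[n]\mathbf{C}_{L_{ff}} + \boldsymbol{\Upsilon}_{fb}[n])\,\mathbf{h}_{ff}[n+1].
\end{aligned} \qquad (6.4.10)$$

In the above expressions, $\mathbf{s}$ is the selection vector described in Section 2.4, and $\mathbf{v}[n]$ is
the first column of the $\boldsymbol{\Upsilon}[n]$ matrix, corresponding to the process noise vector. The
subscripts $0$ and $fb$ indicate the matrix partition described in eq. (2.4.18).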

The prime notation is used to indicate channel convolution matrices that have
been shifted in a particular way: a single prime (i.e. $\mathbf{G}'$) indicates that all entries in
the channel convolution matrix have been shifted to the right (the last column was
removed and a column of zeros was appended to the left). A double prime (i.e. $\mathbf{G}''$)
indicates that all of the entries have been shifted right and down, shifting in zeros
from the left and top.

This model describes how changes in the channel impulse response propagate
through the channel convolution matrix product. The change first manifests itself in
the top and left of the channel convolution matrix product. During the next $L-1$
time steps (for a total of $L$ steps) the change would move up the channel convolution
matrix one row at a time and move through the channel convolution product matrix
$(\mathbf{G}[n]\mathbf{G}^H[n])$ from the right and bottom and work toward the top and left.

The trace of the matrix is equal to the sum of the eigenvalues of the matrix.
The matrix $\mathbf{Q}[n]$ is defined as $\mathbf{Q}[n] = \mathbf{G}_0^H[n]\,\mathbf{G}_0[n] + \mathbf{R}_\nu$. The $(L_{ff}-1) \times (L_{ff}-1)$
principal submatrix of $\mathbf{Q}[n]$ is denoted as $\mathbf{Q}_P[n]$. The bottom right element of $\mathbf{Q}[n]$
is represented by the symbol $q_{L_{ff}L_{ff}}[n]$ and the term $\delta\mathbf{g}[n]$ is defined as
$\delta\mathbf{g}[n] = (\alpha - 1)\mathbf{g}[n] + \mathbf{v}[n]$.

The Poincaré separation theorem states that if the eigenvalues of $\mathbf{Q}[n]$ (denoted
$\lambda_{Q,i}$, $i = 1, \cdots, L_{ff}$) and $\mathbf{Q}_P[n]$ (denoted $\lambda_{Q_P,i}$, $i = 1, \cdots, L_{ff}-1$) are ordered from
greatest to least, then the following relationship between the eigenvalues is satisfied
[38]:

$$
\lambda_{Q,1} \ge \lambda_{Q_P,1} \ge \lambda_{Q,2} \ge \cdots \ge \lambda_{Q_P,L_{ff}-1} \ge \lambda_{Q,L_{ff}}
\tag{6.4.11}
$$
This relationship bounds the changes in each of the eigenvalues from one step to the
next (with the least eigenvalue lower bounded by zero). In addition, at each time
step the trace of $\mathbf{Q}[n]$ changes by $\Delta_Q = |\delta\mathbf{g}[n]|^2 - [\mathbf{Q}[n]]_{(L_{ff},L_{ff})}$, so the change in
the greatest eigenvalue is also bounded and so the change in energy of the channel
convolution matrix product is bounded. Although this argument shows that the
changes in the spectrum of the matrix $\mathbf{Q}[n]$ are bounded, empirically the changes are
not only bounded but smooth. Therefore, the change in equalizer coefficients should
also be smooth.
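
The interlacing relationship in eq. (6.4.11) can be checked numerically. The following is a minimal sketch, not part of the original analysis: the matrix sizes, the random stand-in for $\mathbf{G}_0$, and the white-noise level `sigma2` are all assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
L_ff, N = 6, 4      # hypothetical equalizer and channel lengths
sigma2 = 0.1        # hypothetical white observation-noise variance

# Random complex stand-in for the reduced channel convolution matrix G_0.
G0 = (rng.standard_normal((L_ff + N - 1, L_ff))
      + 1j * rng.standard_normal((L_ff + N - 1, L_ff))) / np.sqrt(2)

# Q = G_0^H G_0 + R_nu with R_nu = sigma2 * I.
Q = G0.conj().T @ G0 + sigma2 * np.eye(L_ff)

# Principal submatrix: drop the last row and the last column of Q.
Q_P = Q[:-1, :-1]

# eigvalsh returns eigenvalues in ascending order; reverse to greatest-to-least.
lam_Q = np.linalg.eigvalsh(Q)[::-1]
lam_QP = np.linalg.eigvalsh(Q_P)[::-1]

# Interlacing: lam_Q[i] >= lam_QP[i] >= lam_Q[i+1] for every i.
tol = 1e-10
interlaced = all(
    lam_Q[i] + tol >= lam_QP[i] >= lam_Q[i + 1] - tol
    for i in range(L_ff - 1)
)
print("interlacing holds:", interlaced)
```

Because $\mathbf{G}_0^H\mathbf{G}_0$ is positive semidefinite, the smallest eigenvalue of `Q` in this sketch is also bounded below by `sigma2`, which is the regularizing effect of white noise.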

When the observation noise is white with variance $\sigma_\nu^2$, the diagonal matrix $\mathbf{R}_\nu = \sigma_\nu^2\mathbf{I}$
acts as a regularization term. This limits the maximum eigenvalue of $\mathbf{Q}^{-1}[n]$ to $\sigma_\nu^{-2}$,
which limits the maximum change in the energy of the equalizer coefficients. In this
case, when the SNR is low, the diagonal noise correlation matrix dominates in $\mathbf{Q}[n]$,
so $\mathbf{Q}[n]$ becomes nearly diagonal (diagonally dominant). This implies that the
perturbation of the equalizer coefficients is approximately linear with similar dynamics
to the channel coefficients. This result will also be seen when analyzing the equalizer
coefficients using a linear perturbation model.

The structure of the updates to $\mathbf{Q}[n]$ indicates that at each step the matrix
update is rank two. Thus a low-complexity method for updating the
CEB-DFE could be constructed where the inverse matrix is updated using a rank
two update. This idea is not explored further in this thesis, but merits future study.

The  model  presented  in  this  section  shows  the  exact  dependence  of  the  equalizer 
coefficients  on  the  change  in  the  channel  coefficients;  few  parts  of  the  equalizer  matrix 
equation  are  changing  from  one  time-step  to  the  next.  In  the  next  subsection,  the 
variation  of  this  complete  model  is  simplified  into  a  block  variation  model. 

6.4.3  Block  variation  model 

The channel convolution matrix update model introduced in the last section is accurate
but cumbersome. To ease use of an equalizer coefficient variation model, a block
variation model is introduced where the channel convolution matrix is only updated
every $L_{ff}$ time steps, where $L_{ff}$ is the number of DFE feedforward coefficients. The
update will change the whole matrix instantaneously so the changes do not propagate
through as in the previous model, which simplifies the analysis of the equalizer
coefficient dynamics.

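To model block changes, eq. (6.2.1) is applied to the whole channel convolution
matrix rather than only a channel vector,

$$
\mathbf{G}[n+1] = \alpha\,\mathbf{G}[n] + \boldsymbol{\Upsilon}[n]
\tag{6.4.12}
$$

where $\boldsymbol{\Upsilon}[n]$ is now a fully populated channel noise matrix appropriately zero-padded
to account for shifting effects. Substituting this model into eq. (2.4.20), the channel
equalizer coefficients can be written as

$$
\begin{aligned}
\mathbf{h}_{ff}[n+1] &= \left(\mathbf{G}_0^H[n+1]\,\mathbf{G}_0[n+1] + \mathbf{R}_\nu\right)^{-1}\mathbf{g}_0^*[n+1] \\
&= \left((\alpha\mathbf{G}_0[n] + \boldsymbol{\Upsilon}_0[n])^H(\alpha\mathbf{G}_0[n] + \boldsymbol{\Upsilon}_0[n]) + \mathbf{R}_\nu\right)^{-1}(\alpha\mathbf{g}_0[n] + \mathbf{v}_0[n])^* \\
&= \left(|\alpha|^2\mathbf{G}_0^H[n]\mathbf{G}_0[n] + \alpha\boldsymbol{\Upsilon}_0^H[n]\mathbf{G}_0[n] + \alpha^*\mathbf{G}_0^H[n]\boldsymbol{\Upsilon}_0[n] + \boldsymbol{\Upsilon}_0^H[n]\boldsymbol{\Upsilon}_0[n] + \mathbf{R}_\nu\right)^{-1} \\
&\quad \times (\alpha\mathbf{g}_0[n] + \mathbf{v}_0[n])^*
\end{aligned}
\tag{6.4.13}
$$

$$
\begin{aligned}
\mathbf{h}_{fb}[n+1] &= -\mathbf{G}_{fb}[n+1]\,\mathbf{h}_{ff}[n+1] \\
&= -(\alpha\mathbf{G}_{fb}[n] + \boldsymbol{\Upsilon}_{fb}[n])\,\mathbf{h}_{ff}[n+1].
\end{aligned}
\tag{6.4.14}
$$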

This model highlights the effect a channel perturbation has on the equalizer
coefficients. The model behaves as if the channel is perturbed at some time, followed
by $L$ time steps during which the channel maintains the perturbed value. The equalizer
coefficients all change simultaneously from one steady state value to the next. In this
case, the matrix $\boldsymbol{\Upsilon}[n]$ is full column rank, so the eigenvalue bounds
shown for the previous model do not apply.
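
A block step of this model can be sketched numerically as follows. This is an illustrative sketch only: it assumes a square, upper-triangular Toeplitz form for the reduced channel convolution matrix, white noise $\mathbf{R}_\nu = \sigma_\nu^2\mathbf{I}$, a noise matrix built from a single noise vector, and hypothetical values for $\alpha$, $N$, and the noise levels. The helper names `conv_matrix`, `ff_coeffs`, and `block_update` are introduced here and do not come from the thesis.

```python
import numpy as np

def conv_matrix(g):
    """Upper-triangular Toeplitz matrix whose first row is g (assumed form of G_0)."""
    N = len(g)
    G = np.zeros((N, N), dtype=complex)
    for i in range(N):
        G[i, i:] = g[:N - i]
    return G

def ff_coeffs(G0, R_nu):
    """Feedforward coefficients h = (G0^H G0 + R_nu)^{-1} G0^H e0."""
    e0 = np.zeros(G0.shape[0], dtype=complex)
    e0[0] = 1.0
    Q = G0.conj().T @ G0 + R_nu
    return np.linalg.solve(Q, G0.conj().T @ e0)

def block_update(G0, alpha, Upsilon0):
    """One block step of eq. (6.4.12): G0[n+1] = alpha * G0[n] + Upsilon0[n]."""
    return alpha * G0 + Upsilon0

rng = np.random.default_rng(3)
N = 4                      # hypothetical number of channel coefficients
alpha = 0.99               # hypothetical Markov coefficient
g = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
v = 0.05 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
R_nu = 0.1 * np.eye(N)     # hypothetical white observation noise

G0 = conv_matrix(g)
G0_next = block_update(G0, alpha, conv_matrix(v))

h_now = ff_coeffs(G0, R_nu)
h_next = ff_coeffs(G0_next, R_nu)
print("norm of coefficient change:", np.linalg.norm(h_next - h_now))
```

All entries of the convolution matrix change in a single step here, so the equalizer coefficients jump directly from one solution of the normal equations to the next.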

This model also highlights the nonlinear relation between the equalizer coefficients
and the channel coefficients. Even when the channel coefficients have simple
dynamics, such as the Markov model, the change in equalizer coefficients to a channel
perturbation is unclear. This idea of a block changing channel is used again when
studying the correlation structure of the equalizer coefficients.

With the exception of the block variational model, which is used as a tool for
describing the channel when evaluating equalizer coefficient correlation, the three
models presented in this section will not be used again in this thesis. They are
included in this chapter to complete the discussion and allow one to visualize the precise
relationship between the channel and equalizer coefficients. The channel convolution
matrix update model is especially illustrative of the one-step update of the
equalizer coefficients and highlights the nonlinear relationship between the channel and
equalizer coefficients.

6.5  First-order Taylor expansion of equalizer coefficients

Another  method  for  determining  the  change  in  equalizer  coefficients  due  to  a  change 
in  the  channel  coefficients  is  the  Taylor  expansion  of  the  equalizer  coefficients.  This 
provides  a  linearized  model  of  the  equalizer  coefficient  dynamics  and  a  way  to  compare 
the  channel  coefficient  dynamics  and  the  equalizer  coefficient  dynamics. 

The derivation of the first-order Taylor expansion is presented in two steps. First,
the derivation is given for a scalar channel (i.e. a channel with one coefficient). Second,
the derivation is presented for a Taylor expansion of an equalizer with more than one
coefficient (a vector equalizer). The channel length and equalizer length are assumed
to be equal throughout these derivations. The time index is dropped for clarity.

6.5.1  Scalar equalizer coefficient based on scalar channel perturbation

To gain intuition into the dependence of the feedforward equalizer coefficients on
the dynamics of the channel coefficients, the scalar channel and scalar equalizer are
analyzed. For the scalar channel, the first order Taylor expansion answers the question
of "What is the (approximate) magnitude of the equalizer coefficient perturbation,
$\delta h = h(g + \delta g, g^* + \delta g^*) - h(g, g^*)$, caused by a channel impulse response perturbation
$\delta g$?" The scalar channel provides a clear view of the interplay between the dynamics
of the channel and the dynamics of the equalizer.

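For a scalar channel, the equalizer coefficient that minimizes the MMSE cost
function is

$$
h(g, g^*) = \frac{g^*}{g^* g + \sigma_\nu^2}
\tag{6.5.1}
$$

The Taylor expansion of the equalizer coefficient about the channel coefficient is

$$
h(g + \delta g, g^* + \delta g^*) \approx h(g, g^*) + \frac{\partial h}{\partial g}\,\delta g + \frac{\partial h}{\partial g^*}\,\delta g^*
\tag{6.5.2}
$$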


After evaluating the partial derivatives and rearranging terms, the Taylor expansion
becomes

$$
h(g + \delta g, g^* + \delta g^*) \approx \frac{(g + \delta g)^*}{g g^* + \sigma_\nu^2} - \left(\frac{g^*\delta g + g\,\delta g^*}{g g^* + \sigma_\nu^2}\right) h(g, g^*)
\tag{6.5.3}
$$

The expression for the linear perturbation model of the equalizer coefficient is

$$
h(g + \delta g, g^* + \delta g^*) = h(g, g^*) + \delta h.
\tag{6.5.4}
$$

From eq. (6.5.3), the equalizer coefficient perturbation term is

$$
\delta h = \frac{\delta g^*}{g g^* + \sigma_\nu^2} - \left(\frac{g^*\delta g + g\,\delta g^*}{g g^* + \sigma_\nu^2}\right) h(g, g^*)
\tag{6.5.5}
$$


Derivation  of  first-order  Taylor  expansion  for  scalar  equalizer 


Throughout  this  chapter,  a  variable  and  its  conjugate  are  treated  as  two  separate 
complex  variables,  as  done  in  [9];  the  partial  derivative  with  respect  to  a  variable 
treats  the  conjugate  of  the  variable  as  a  constant. 

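The partial derivatives of $h$ with respect to $g$ and with respect to $g^*$ are

$$
\frac{\partial h}{\partial g} = -\frac{(g^*)^2}{(g g^* + \sigma_\nu^2)^2}
$$

$$
\frac{\partial h}{\partial g^*} = -\frac{g g^*}{(g g^* + \sigma_\nu^2)^2} + \frac{1}{g g^* + \sigma_\nu^2}.
$$

The linear perturbation model of the equalizer coefficient around the point $g$ is

$$
\begin{aligned}
h(g + \delta g, g^* + \delta g^*) &\approx h(g, g^*) - \frac{(g^*)^2}{(g g^* + \sigma_\nu^2)^2}\,\delta g
+ \left(-\frac{g g^*}{(g g^* + \sigma_\nu^2)^2} + \frac{1}{g g^* + \sigma_\nu^2}\right)\delta g^* \\
&= \frac{(g + \delta g)^*}{g g^* + \sigma_\nu^2} - \left(\frac{g^*\delta g + g\,\delta g^*}{g g^* + \sigma_\nu^2}\right) h(g, g^*).
\end{aligned}
$$

The last equality follows from substituting eq. (6.5.1) and simplifying terms.
Using the first-order Taylor expansion, the equalizer coefficients can
be written as

$$
\begin{aligned}
h(g + \delta g, g^* + \delta g^*) &= h(g, g^*) + \delta h \\
&\approx h(g, g^*) + \frac{\delta g^*}{g g^* + \sigma_\nu^2} - \left(\frac{g^*\delta g + g\,\delta g^*}{g g^* + \sigma_\nu^2}\right) h(g, g^*),
\end{aligned}
$$

which implies

$$
\delta h \approx \frac{\delta g^*}{g g^* + \sigma_\nu^2} - \left(\frac{g^*\delta g + g\,\delta g^*}{g g^* + \sigma_\nu^2}\right) h(g, g^*).
$$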


Interpretation  of  results  for  scalar  equalizer 

There are two interesting quantities that can be extracted from the first order Taylor
expansion of the equalizer coefficients. The first is the normalized change in the
equalizer coefficients versus the channel coefficients. The second quantity is the behavior
of the linear model in the extreme SNR regions.

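Using eqns. (6.5.1) and (6.5.5), the normalized change in the equalizer coefficient,
$\delta h / h$, can be written in terms of the normalized change of the channel coefficient, $\delta g / g$, as

$$
\begin{aligned}
\frac{\delta h}{h} &= \frac{\dfrac{\delta g^*}{g g^* + \sigma_\nu^2} - \left(\dfrac{g^*\delta g + g\,\delta g^*}{g g^* + \sigma_\nu^2}\right) h(g, g^*)}{h(g, g^*)} \\
&= \frac{\delta g^*}{g^*} - \frac{g g^*}{g g^* + \sigma_\nu^2}\left(\frac{\delta g}{g} + \frac{\delta g^*}{g^*}\right) \\
&= \frac{\delta g^*}{g^*}\left(\frac{\sigma_\nu^2}{|g|^2 + \sigma_\nu^2}\right) - \frac{\delta g}{g}\left(\frac{|g|^2}{|g|^2 + \sigma_\nu^2}\right)
\end{aligned}
\tag{6.5.6}
$$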


Thus, the normalized change in the equalizer coefficient is a combination of the
normalized change of the channel coefficient and its conjugate. Because the two weights
in eq. (6.5.6) are nonnegative and sum to one, this implies that
$|\delta h / h| \le |\delta g / g|$, i.e. the absolute normalized change of the equalizer coefficient is less
than or equal to the absolute normalized change in the channel coefficient, and so the
equalizer coefficient should change slower than the channel coefficient in this model.
Empirically, this turns out not to be the case, which implies that the linearized
perturbation model does not accurately capture the dynamics of the equalizer coefficients.
Analysis of $\delta h$ is still illuminating at the extreme ranges of SNR.
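
The scalar perturbation formula and the bound on the normalized change can be checked with a few lines of code. The sketch below is illustrative only; the channel value `g`, the perturbation `dg`, and the noise variance `sigma2` are arbitrary assumed numbers, and the function names are introduced here.

```python
import numpy as np

def h_scalar(g, sigma2):
    """Scalar MMSE coefficient of eq. (6.5.1): h = g* / (|g|^2 + sigma2)."""
    return np.conj(g) / (abs(g) ** 2 + sigma2)

def delta_h_first_order(g, dg, sigma2):
    """First-order perturbation of eq. (6.5.5)."""
    D = abs(g) ** 2 + sigma2
    return np.conj(dg) / D - (np.conj(g) * dg + g * np.conj(dg)) / D * h_scalar(g, sigma2)

g = 0.8 - 0.3j             # hypothetical channel coefficient
sigma2 = 0.05              # hypothetical noise variance
dg = 1e-3 * (0.6 + 0.8j)   # small hypothetical perturbation, |dg| = 1e-3

exact = h_scalar(g + dg, sigma2) - h_scalar(g, sigma2)
approx = delta_h_first_order(g, dg, sigma2)
err = abs(exact - approx)  # second order in |dg|

# Normalized changes compared in eq. (6.5.6).
ratio_h = abs(approx / h_scalar(g, sigma2))
ratio_g = abs(dg / g)
print(err, ratio_h, ratio_g)
```

The residual `err` shrinks quadratically as `dg` is made smaller, which is the expected behavior of a first-order expansion; the bound `ratio_h <= ratio_g` holds for the linearized model only, consistent with the caveat above.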


At high SNR where $|g|^2 \gg \sigma_\nu^2$, eq. (6.5.5) is approximated instead as

$$
\delta h \approx -\frac{\delta g}{g}\,h(g, g^*).
\tag{6.5.8}
$$

The change in the equalizer coefficient in the high SNR region still depends on the
current equalizer coefficient value.


One result of these derivations is that at low SNR, the dynamics of the equalizer
coefficient and the channel coefficient are very similar. At high SNR the dynamics of
the equalizer and channel coefficients are very different. In a later section it is shown
that, in the high SNR region, the coherence time of the equalizer coefficients is shorter
than that of the channel coefficients.

6.5.2  Vector equalizer coefficients based on vector channel


A more realistic scenario than the scalar case is the multi-coefficient channel and a
multi-coefficient equalizer. When the channel has more than one coefficient, however,
the mathematics is more involved. This section presents the first-order Taylor
expansion of the multi-coefficient equalizer, which provides a method for comparing the
dynamics of the equalizer coefficients with those of the channel coefficients.


Only the DFE feedforward coefficient dynamics are analyzed. This is done because
the feedback coefficient dynamics are similar to the channel coefficient dynamics. The
subscript $ff$ is dropped from all of the equalizer coefficient labels because only one part
of the equalizer is being analyzed. For simplification, the channel is assumed to change
in block increments so the channel convolution matrix is Toeplitz.


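The first order Taylor expansion of the DFE feedforward (or LE) coefficients is

$$
\mathbf{h}(\mathbf{g} + \delta\mathbf{g}, \mathbf{g}^* + \delta\mathbf{g}^*) \approx \mathbf{h}(\mathbf{g}, \mathbf{g}^*) + \frac{\partial\mathbf{h}}{\partial\mathbf{g}}\,\delta\mathbf{g} + \frac{\partial\mathbf{h}}{\partial\mathbf{g}^*}\,\delta\mathbf{g}^*
\tag{6.5.9}
$$

where $\frac{\partial\mathbf{h}}{\partial\mathbf{g}}$ is the Jacobian of $\mathbf{h}$ with respect to $\mathbf{g}$. The column vector $\delta\mathbf{g}[n]$ is a channel
coefficient perturbation. The elements of $\delta\mathbf{g}[n]$ (at time $n$) are

$$
\delta\mathbf{g}[n] = \begin{bmatrix} \delta g[n,0] & \delta g[n,1] & \cdots & \delta g[n,N-1] \end{bmatrix}^T
\tag{6.5.10}
$$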


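Substituting for the Jacobian matrices in eq. (6.5.9), the first-order perturbation
model of the equalizer coefficients is

$$
\mathbf{h}(\mathbf{g} + \delta\mathbf{g}, \mathbf{g}^* + \delta\mathbf{g}^*) \approx \mathbf{h}(\mathbf{g}, \mathbf{g}^*) + \mathbf{Q}^{-1}\delta\mathbf{g}^* - \mathbf{Q}^{-1}\left[\delta\mathbf{G}_0^H\mathbf{G}_0 + \mathbf{G}_0^H\delta\mathbf{G}_0\right]\mathbf{h}(\mathbf{g}, \mathbf{g}^*).
\tag{6.5.11}
$$

The matrix $\delta\mathbf{G}_0$ is an upper triangular matrix,

$$
\delta\mathbf{G}_0[n] = \begin{bmatrix}
\delta g[n,0] & \delta g[n,1] & \delta g[n,2] & \cdots & \delta g[n,N-1] \\
0 & \delta g[n,0] & \delta g[n,1] & \cdots & \delta g[n,N-2] \\
0 & 0 & \delta g[n,0] & \cdots & \delta g[n,N-3] \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & \delta g[n,0]
\end{bmatrix}
\tag{6.5.12}
$$

where $\delta g[n,i]$ is the $i$th element of the $\delta\mathbf{g}[n]$ vector at time $n$. The channel is modeled
as block changing, so all elements have the same time-index $n$. The matrix $\delta\mathbf{G}_0$ has
a similar structure to the reduced channel convolution matrix

$$
\mathbf{G}_0[n] = \begin{bmatrix}
g[n,0] & g[n,1] & g[n,2] & \cdots & g[n,N-1] \\
0 & g[n,0] & g[n,1] & \cdots & g[n,N-2] \\
0 & 0 & g[n,0] & \cdots & g[n,N-3] \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & g[n,0]
\end{bmatrix}
\tag{6.5.13}
$$

In this matrix, $g[n,i]$ is the $i$th coefficient of the channel impulse response vector $\mathbf{g}[n]$.
The channel is assumed to be block changing so all channel coefficients again have the
same time index. For the remainder of this section, the time-index $n$ is suppressed
for brevity so $g[i] = g[n,i]$.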


Derivation  of  first-order  Taylor  expansion  of  vector  equalizer 

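Recall from above that the first-order Taylor expansion of the equalizer coefficients
with respect to the channel coefficients is

$$
\mathbf{h}(\mathbf{g} + \delta\mathbf{g}, \mathbf{g}^* + \delta\mathbf{g}^*) \approx \mathbf{h}(\mathbf{g}, \mathbf{g}^*) + \frac{\partial\mathbf{h}}{\partial\mathbf{g}}\,\delta\mathbf{g} + \frac{\partial\mathbf{h}}{\partial\mathbf{g}^*}\,\delta\mathbf{g}^*
$$

When there are $N$ channel coefficients, the Jacobian of the equalizer coefficients
with respect to the channel coefficients is defined as

$$
J(\mathbf{h}, \mathbf{g}) = \frac{\partial\mathbf{h}}{\partial\mathbf{g}} = \begin{bmatrix} \dfrac{\partial\mathbf{h}}{\partial g[0]} & \dfrac{\partial\mathbf{h}}{\partial g[1]} & \cdots & \dfrac{\partial\mathbf{h}}{\partial g[N-1]} \end{bmatrix}
\tag{6.5.14}
$$

Similarly, the Jacobian of the equalizer coefficients with respect to the conjugate of the
channel coefficients is

$$
J(\mathbf{h}, \mathbf{g}^*) = \frac{\partial\mathbf{h}}{\partial\mathbf{g}^*} = \begin{bmatrix} \dfrac{\partial\mathbf{h}}{\partial g^*[0]} & \dfrac{\partial\mathbf{h}}{\partial g^*[1]} & \cdots & \dfrac{\partial\mathbf{h}}{\partial g^*[N-1]} \end{bmatrix}
\tag{6.5.15}
$$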


In this derivation, complex variables and their conjugates are assumed to be
two different variables as proposed in [9]. Also, the following matrix identity is used
throughout the derivation (see e.g. [54]),

$$
\frac{\partial\mathbf{A}^{-1}(t)}{\partial t} = -\mathbf{A}^{-1}(t)\,\frac{\partial\mathbf{A}(t)}{\partial t}\,\mathbf{A}^{-1}(t)
\tag{6.5.16}
$$

In this identity, $\mathbf{A}(t)$ is an invertible matrix whose elements are functions of a
parameter $t$.

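Recall from eq. (2.4.20) that the DFE feedforward coefficients are

$$
\mathbf{h} = \left[\mathbf{G}_0^H\mathbf{G}_0 + \mathbf{R}_\nu\right]^{-1}\mathbf{G}_0^H\mathbf{e}_0 = \mathbf{Q}^{-1}\mathbf{g}_0^*
\tag{6.5.17}
$$

where $\mathbf{e}_i$ is a column vector with a '1' in the $(i+1)$th position and all other elements
zero, e.g.

$$
\mathbf{e}_2 = \begin{bmatrix} 0 & 0 & 1 & 0 & \cdots & 0 \end{bmatrix}^T
\tag{6.5.18}
$$

The column vector $\mathbf{g}_0^*$ is the conjugate of the complete channel impulse response vector,
i.e. the conjugate transpose of the first row of the reduced channel convolution matrix, $\mathbf{G}_0$:

$$
\mathbf{g}_0^* = \begin{bmatrix} g^*[0] & g^*[1] & \cdots & g^*[N-1] \end{bmatrix}^T
\tag{6.5.19}
$$

Defining the matrix $\mathbf{Q}$ as

$$
\mathbf{Q} = \mathbf{G}_0^H\mathbf{G}_0 + \mathbf{R}_\nu,
\tag{6.5.20}
$$

the partial derivative of the equalizer coefficients with respect to $g[i]$ is

$$
\begin{aligned}
\frac{\partial\mathbf{h}}{\partial g[i]} &= \frac{\partial}{\partial g[i]}\,\mathbf{Q}^{-1}\mathbf{g}_0^* \\
&= -\mathbf{Q}^{-1}\frac{\partial\mathbf{Q}}{\partial g[i]}\mathbf{Q}^{-1}\mathbf{g}_0^* \\
&= -\mathbf{Q}^{-1}\mathbf{G}_0^H\mathbf{S}_N^i\,\mathbf{h}(\mathbf{g}, \mathbf{g}^*)
\end{aligned}
\tag{6.5.21}
$$

The second equality comes from substituting eq. (6.5.16) into the first equality. The
matrix $\mathbf{S}_N$ is the shift matrix defined in eq. (6.4.6) and the matrix $\mathbf{S}_N^0$ is assumed to
be the identity matrix. The last equality comes from evaluating the partial derivative
with respect to the channel coefficient $g[i]$. The noise correlation matrix $\mathbf{R}_\nu$ has no
dependence on $g[i]$ so its derivative evaluates to zero. The reduced channel convolution
matrix is Toeplitz so the Jacobian is also Toeplitz. To obtain the last equality,
the relation $\mathbf{h}(\mathbf{g}, \mathbf{g}^*) = \mathbf{Q}^{-1}\mathbf{g}_0^*$ from eq. (6.5.17) is substituted into the second equality.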


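The partial derivative of the equalizer coefficients with respect to $g^*[i]$ is

$$
\begin{aligned}
\frac{\partial\mathbf{h}}{\partial g^*[i]} &= \frac{\partial}{\partial g^*[i]}\,\mathbf{Q}^{-1}\mathbf{g}_0^* \\
&= -\mathbf{Q}^{-1}\frac{\partial\mathbf{Q}}{\partial g^*[i]}\mathbf{Q}^{-1}\mathbf{g}_0^* + \mathbf{Q}^{-1}\frac{\partial\mathbf{g}_0^*}{\partial g^*[i]} \\
&= -\mathbf{Q}^{-1}\left(\mathbf{S}_N^i\right)^H\mathbf{G}_0\,\mathbf{h}(\mathbf{g}, \mathbf{g}^*) + \mathbf{Q}^{-1}\mathbf{e}_i.
\end{aligned}
\tag{6.5.22}
$$

The additional term $\mathbf{Q}^{-1}\mathbf{e}_i$ is the $i$th column (counting from zero) of the inverse of the matrix $\mathbf{Q}$. The
complete Jacobian matrices can be constructed from the component results

$$
J(\mathbf{h}, \mathbf{g}) = \begin{bmatrix} -\mathbf{Q}^{-1}\mathbf{G}_0^H\mathbf{S}_N^0\,\mathbf{h}(\mathbf{g}, \mathbf{g}^*) & \cdots & -\mathbf{Q}^{-1}\mathbf{G}_0^H\mathbf{S}_N^{N-1}\,\mathbf{h}(\mathbf{g}, \mathbf{g}^*) \end{bmatrix}
\tag{6.5.23}
$$

$$
J(\mathbf{h}, \mathbf{g}^*) = \begin{bmatrix} -\mathbf{Q}^{-1}\left(\mathbf{S}_N^0\right)^H\mathbf{G}_0\,\mathbf{h}(\mathbf{g}, \mathbf{g}^*) & \cdots & -\mathbf{Q}^{-1}\left(\mathbf{S}_N^{N-1}\right)^H\mathbf{G}_0\,\mathbf{h}(\mathbf{g}, \mathbf{g}^*) \end{bmatrix} + \mathbf{Q}^{-1}
\tag{6.5.24}
$$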


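The quantities of interest in eq. (6.5.9) are the Jacobian matrices times the channel
coefficient perturbation vectors, i.e. $J(\mathbf{h}, \mathbf{g})\,\delta\mathbf{g}$ and $J(\mathbf{h}, \mathbf{g}^*)\,\delta\mathbf{g}^*$. The product
$J(\mathbf{h}, \mathbf{g})\,\delta\mathbf{g}$ is

$$
\begin{aligned}
J(\mathbf{h}, \mathbf{g})\,\delta\mathbf{g} &= -\mathbf{Q}^{-1}\mathbf{G}_0^H\left[\sum_{i=0}^{N-1}\delta g[i]\,\mathbf{S}_N^i\right]\mathbf{h}(\mathbf{g}, \mathbf{g}^*) \\
&= -\mathbf{Q}^{-1}\mathbf{G}_0^H\,\delta\mathbf{G}_0\,\mathbf{h}(\mathbf{g}, \mathbf{g}^*),
\end{aligned}
\tag{6.5.25}
$$

where the matrix $\delta\mathbf{G}_0$ is defined in eq. (6.5.12). Similarly, the product $J(\mathbf{h}, \mathbf{g}^*)\,\delta\mathbf{g}^*$ is

$$
J(\mathbf{h}, \mathbf{g}^*)\,\delta\mathbf{g}^* = -\mathbf{Q}^{-1}\delta\mathbf{G}_0^H\mathbf{G}_0\,\mathbf{h}(\mathbf{g}, \mathbf{g}^*) + \mathbf{Q}^{-1}\delta\mathbf{g}^*
\tag{6.5.26}
$$

Substituting the relations from eqns. (6.5.25) and (6.5.26) into eq. (6.5.9), the
first-order Taylor expansion of the equalizer coefficients is

$$
\mathbf{h}(\mathbf{g} + \delta\mathbf{g}, \mathbf{g}^* + \delta\mathbf{g}^*) \approx \mathbf{h}(\mathbf{g}, \mathbf{g}^*) + \mathbf{Q}^{-1}\delta\mathbf{g}^* - \mathbf{Q}^{-1}\left[\delta\mathbf{G}_0^H\mathbf{G}_0 + \mathbf{G}_0^H\delta\mathbf{G}_0\right]\mathbf{h}(\mathbf{g}, \mathbf{g}^*)
$$

The equalizer coefficients from a channel perturbation can be written as

$$
\mathbf{h}(\mathbf{g} + \delta\mathbf{g}, \mathbf{g}^* + \delta\mathbf{g}^*) = \mathbf{h}(\mathbf{g}, \mathbf{g}^*) + \delta\mathbf{h}
$$

From the first order Taylor expansion, the perturbation term is

$$
\delta\mathbf{h} \approx \mathbf{Q}^{-1}\delta\mathbf{g}^* - \mathbf{Q}^{-1}\left[\delta\mathbf{G}_0^H\mathbf{G}_0 + \mathbf{G}_0^H\delta\mathbf{G}_0\right]\mathbf{h}(\mathbf{g}, \mathbf{g}^*)
\tag{6.5.27}
$$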

Notice  that  the  relation  for  the  vector  equalizer  reduces  to  the  relation  for  the  scalar 
equalizer  from  eq.  (6.5.5)  when  the  number  of  equalizer  and  channel  coefficients  is 
reduced  to  one. 
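
The vector perturbation formula can be compared against a direct recomputation of the equalizer. The sketch below is illustrative and rests on assumptions: a square upper-triangular Toeplitz $\mathbf{G}_0$ built from the channel vector, white noise $\mathbf{R}_\nu = \sigma_\nu^2\mathbf{I}$, and hypothetical sizes and perturbation scale. The helper names are introduced here.

```python
import numpy as np

def conv_matrix(g):
    """Upper-triangular Toeplitz matrix with first row g (assumed form of G_0)."""
    N = len(g)
    G = np.zeros((N, N), dtype=complex)
    for i in range(N):
        G[i, i:] = g[:N - i]
    return G

def ff_coeffs(g, R_nu):
    """Return h = Q^{-1} g*, together with Q and G0, following eq. (6.5.17)."""
    G0 = conv_matrix(g)
    Q = G0.conj().T @ G0 + R_nu
    return np.linalg.solve(Q, np.conj(g)), Q, G0

rng = np.random.default_rng(1)
N = 4                                   # hypothetical channel length
g = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
dg = 1e-6 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
R_nu = 0.1 * np.eye(N)                  # hypothetical white noise

h, Q, G0 = ff_coeffs(g, R_nu)
dG0 = conv_matrix(dg)

# First-order perturbation: dh ~ Q^{-1} dg* - Q^{-1} [dG0^H G0 + G0^H dG0] h.
rhs = np.conj(dg) - (dG0.conj().T @ G0 + G0.conj().T @ dG0) @ h
dh_approx = np.linalg.solve(Q, rhs)

# Exact change obtained by recomputing the equalizer for the perturbed channel.
h_pert, _, _ = ff_coeffs(g + dg, R_nu)
dh_exact = h_pert - h

rel_err = np.linalg.norm(dh_exact - dh_approx) / np.linalg.norm(dh_exact)
print("relative error of first-order model:", rel_err)
```

The relative error scales with the size of `dg`, so larger perturbations expose the nonlinearity that the linearized model ignores.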

Interpretation  of  results  for  vector  equalizer 

The form of the equalizer perturbation, $\delta\mathbf{h}$, at high and low SNR reveals the equalizer
dynamics in these regions. The eigenvalues of the matrix $\mathbf{G}_0^H\mathbf{G}_0$ are denoted $\lambda_{g,i}$ and
the eigenvalues of the matrix $\mathbf{R}_\nu$ are denoted $\lambda_{r,i}$ for $i = 1, \cdots, N$. The eigenvalues
are ordered from greatest to least, i.e. $\lambda_{g,1}$ is the greatest eigenvalue of the matrix
$\mathbf{G}_0^H\mathbf{G}_0$.

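At low SNR, $\lambda_{r,N} \gg \lambda_{g,1}$ so $\mathbf{G}_0^H\mathbf{G}_0 + \mathbf{R}_\nu \approx \mathbf{R}_\nu$. Treating $\mathbf{G}_0^H\mathbf{G}_0$ as a Hermitian
perturbation matrix, the maximum perturbation of the eigenvalues of $\mathbf{R}_\nu$ is $\lambda_{g,1}$.
Since this perturbation is small, the eigen-structure of $\mathbf{R}_\nu$ will not change noticeably
when perturbed by $\mathbf{G}_0^H\mathbf{G}_0$ [99].

Using this assumption, eq. (6.5.27) becomes

$$
\begin{aligned}
\delta\mathbf{h} &= \mathbf{R}_\nu^{-1}\delta\mathbf{g}^* - \mathbf{R}_\nu^{-1}\left[\delta\mathbf{G}_0^H\mathbf{G}_0 + \mathbf{G}_0^H\delta\mathbf{G}_0\right]\mathbf{R}_\nu^{-1}\mathbf{g}_0^* \\
&\approx \mathbf{R}_\nu^{-1}\delta\mathbf{g}^*
\end{aligned}
\tag{6.5.28}
$$

The term

$$
\mathbf{R}_\nu^{-1}\left[\delta\mathbf{G}_0^H\mathbf{G}_0 + \mathbf{G}_0^H\delta\mathbf{G}_0\right]\mathbf{R}_\nu^{-1}\mathbf{g}_0^*
$$

is approximately zero since the inverse noise correlation term dominates. The
equalizer coefficients have the form of a whitened matched filter.

When the SNR is low, eq. (6.5.28) can be used to approximate the perturbed
equalizer coefficients as

$$
\mathbf{h}(\mathbf{g} + \delta\mathbf{g}, \mathbf{g}^* + \delta\mathbf{g}^*) \approx \mathbf{R}_\nu^{-1}(\mathbf{g} + \delta\mathbf{g})^*.
\tag{6.5.29}
$$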

At  low  SNR  the  dynamics  of  the  equalizer  coefficients  and  the  channel  coefficients  are 
equivalent.  A  perturbation  of  the  channel  coefficients  causes  a  proportional  change 
in  the  equalizer  coefficients. 

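Rearranging the relation for the equalizer coefficient perturbation, $\delta\mathbf{h}$, so the terms
based on $\delta\mathbf{g}$ are separated from those based on $\delta\mathbf{g}^*$ simplifies the analysis at high SNR.
A relationship between the channel coefficients and the channel convolution matrix
that simplifies the derivation is

$$
\delta\mathbf{g}^* = \delta\mathbf{G}_0^H\mathbf{e}_0
\tag{6.5.30}
$$

Substituting this relation into the equalizer coefficient perturbation from eq. (6.5.27)
gives an expression for the equalizer perturbation,

$$
\delta\mathbf{h} = \mathbf{Q}^{-1}\delta\mathbf{G}_0^H\left(\mathbf{e}_0 - \mathbf{G}_0\,\mathbf{h}(\mathbf{g}, \mathbf{g}^*)\right) - \mathbf{Q}^{-1}\mathbf{G}_0^H\,\delta\mathbf{G}_0\,\mathbf{h}(\mathbf{g}, \mathbf{g}^*).
\tag{6.5.31}
$$

At high SNR, the reduced channel convolution matrix product dominates the
noise correlation matrix, i.e. $\lambda_{g,N} \gg \lambda_{r,1}$, so $\mathbf{Q} \approx \mathbf{G}_0^H\mathbf{G}_0$. Substituting this
approximation and the definition of the equalizer coefficients from eq. (2.4.20) into eq. (6.5.31)
gives

$$
\delta\mathbf{h} \approx (\mathbf{G}_0^H\mathbf{G}_0)^{-1}\delta\mathbf{G}_0^H\left(\mathbf{I} - \mathbf{G}_0(\mathbf{G}_0^H\mathbf{G}_0)^{-1}\mathbf{G}_0^H\right)\mathbf{e}_0 - (\mathbf{G}_0^H\mathbf{G}_0)^{-1}\mathbf{G}_0^H\,\delta\mathbf{G}_0\,\mathbf{h}(\mathbf{g}, \mathbf{g}^*)
$$

The term $\left(\mathbf{I} - \mathbf{G}_0(\mathbf{G}_0^H\mathbf{G}_0)^{-1}\mathbf{G}_0^H\right)$ is a projection matrix onto the null space of $\mathbf{G}_0^H$,
i.e. the orthogonal complement of the column space of $\mathbf{G}_0$ [112]. Assuming $g[0] \neq 0$, the
matrix $\mathbf{G}_0$ is full rank, so this subspace contains only the zero vector and the projection
vanishes. The perturbation term becomes

$$
\delta\mathbf{h} \approx -(\mathbf{G}_0^H\mathbf{G}_0)^{-1}\mathbf{G}_0^H\,\delta\mathbf{G}_0\,\mathbf{h}(\mathbf{g}, \mathbf{g}^*)
\tag{6.5.32}
$$

Therefore, at high SNR the equalizer coefficient perturbation depends on the
unperturbed values of the equalizer coefficients.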


6.6  Correlation  structure  of  equalizer  coefficients 

A final way to evaluate the equalizer dynamics is by determining the equalizer
coefficient correlation structure based on the correlation structure of the channel
coefficients. In this section, the extreme SNR regions are analyzed.

At low SNR, the equalizer coefficient correlation is equivalent to the channel
coefficient correlation, a result previewed in the last section. When the channel and
equalizer coefficients have the same dynamics, the CEB and DA methods have similar
error performance, which is supported by experimental data [91].

At high SNR, equalizers with more than one coefficient are well approximated
by scalar equalizers; the analysis of a single coefficient equalizer produces results
applicable to the multiple coefficient equalizer.


The estimated equalizer coefficient correlation functions are presented in the next
section. These functions indicate that the coherence time of the equalizer coefficients
is less than that of the channel coefficients. There is therefore a shorter data averaging
window for the equalizer coefficients than for the channel coefficients. The equalizer
coefficients calculated using the DA method will have higher error than the CEB equalizer
coefficients. Simulation data supports this hypothesis of the relative performance
between the two algorithms [91].


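Recall from eq. (2.2.5) that DFE feedforward coefficients are

$$
\mathbf{h}_{ff}[n] = \mathbf{Q}[n]^{-1}\mathbf{g}_0^*[n] = \left(\mathbf{G}_0^H[n]\,\mathbf{G}_0[n] + \sigma_d^{-2}\mathbf{R}_\nu\right)^{-1}\mathbf{g}_0^*[n],
\tag{6.6.1}
$$

where the transmitted symbol energy, $\sigma_d^2$, is explicitly reintroduced.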


In this section, the channel is assumed to be Rayleigh fading with the variance
of each channel coefficient equal to $\sigma_g^2$. If the channel coefficients are independently
varying and the noise is white with variance $\sigma_\nu^2$, the average SNR of the communications
channel is $N\sigma_g^2 / \sigma_\nu^2$, where $N$ is the number of channel coefficients.


Recall that the $i$th eigenvalue of $\mathbf{G}_0^H\mathbf{G}_0$ is denoted by $\lambda_{g,i}$ for $i = 1, \cdots, N$, and
that the eigenvalues are ordered from greatest to least, i.e.

$$
\lambda_{g,1} \ge \lambda_{g,2} \ge \cdots \ge \lambda_{g,N}
$$

Similarly, the eigenvalues of the noise correlation matrix are

$$
\lambda_{r,1} \ge \lambda_{r,2} \ge \cdots \ge \lambda_{r,N}
$$

6.6.1  Correlation of equalizer coefficients at low SNR

At low SNR, the noise energy (scaled by the transmit symbol energy) dominates over
the channel energy and so $\lambda_{r,N} \gg \lambda_{g,1}$. The matrix $\mathbf{Q}[n]$ can be approximated as

$$
\mathbf{Q}[n] = \mathbf{G}_0^H[n]\,\mathbf{G}_0[n] + \sigma_d^{-2}\mathbf{R}_\nu \approx \sigma_d^{-2}\mathbf{R}_\nu
$$

The time dependence of $\mathbf{Q}[n]$ is removed because the noise statistics are stationary.
The DFE feedforward equalizer coefficients at low SNR are

$$
\mathbf{h}_{ff}[n] \approx \mathbf{R}_\nu^{-1}\sigma_d^2\,\mathbf{g}_0^*[n]
\tag{6.6.2}
$$

Note that the equalizer coefficients in eq. (6.6.2) are the whitened matched filter of the
channel coefficients.

The correlation matrix of the channel coefficients at lag $m$ is defined as

$$
\mathbf{R}_g[m] = E\{\mathbf{g}[n]\,\mathbf{g}^H[n+m]\}.
\tag{6.6.3}
$$

Similarly, the equalizer coefficient correlation matrix is

$$
\mathbf{R}_h[m] = E\{\mathbf{h}[n]\,\mathbf{h}^H[n+m]\}
\tag{6.6.4}
$$

Substituting the expression from eq. (6.6.2) into eq. (6.6.4), the equalizer correlation
becomes

$$
\mathbf{R}_h[m] \approx \mathbf{R}_\nu^{-1}\mathbf{R}_g^*[m]\,\mathbf{R}_\nu^{-1}
\tag{6.6.5}
$$

When the noise is white, $\mathbf{R}_\nu = \sigma_\nu^2\mathbf{I}$, the equalizer coefficient correlation matrix at
lag $m$ is

$$
\mathbf{R}_h[m] \approx \sigma_\nu^{-4}\mathbf{R}_g^*[m]
\tag{6.6.6}
$$

This relation shows that the equalizer coefficient correlation is a scaled version of the
channel coefficient correlation. By normalizing the correlation matrices so that the
maximum value is one, the channel and equalizer coefficient correlation matrices are
equal.

Since the channel and equalizer coefficients have the same correlation, the averaging
window lengths used for both sets of coefficients are equal. The DA and CEB
methods of calculating the equalizer coefficients therefore have the same error
performance at low SNR. This has been confirmed in the literature [91] and is shown with
simulation data later in this chapter.
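
The low SNR equivalence, and its breakdown at high SNR, can be illustrated with a short simulation of a single-coefficient equalizer. This is a sketch under stated assumptions rather than the simulation used in the thesis: the channel is a unit-variance complex Gaussian first-order Markov (AR(1)) process, and the coefficient `alpha`, the lag, the sample count, and the two noise variances are hypothetical values chosen for illustration.

```python
import numpy as np

def ar1_channel(num, alpha, rng):
    """Unit-variance complex Gaussian AR(1) channel: g[n] = alpha*g[n-1] + v[n]."""
    v_scale = np.sqrt((1 - alpha ** 2) / 2)
    v = v_scale * (rng.standard_normal(num) + 1j * rng.standard_normal(num))
    g = np.empty(num, dtype=complex)
    g[0] = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)
    for n in range(1, num):
        g[n] = alpha * g[n - 1] + v[n]
    return g

def norm_corr(x, lag):
    """Magnitude of the sample correlation E{x[n] x*[n+lag]}, normalized by E{|x|^2}."""
    num = np.mean(x[:-lag] * np.conj(x[lag:]))
    return abs(num) / np.mean(abs(x) ** 2)

rng = np.random.default_rng(2)
g = ar1_channel(200_000, 0.999, rng)
lag = 20

# Scalar MMSE coefficient h = g* / (|g|^2 + sigma_nu^2) at two noise levels.
h_low = np.conj(g) / (abs(g) ** 2 + 100.0)   # low SNR: noise dominates
h_high = np.conj(g) / (abs(g) ** 2 + 0.01)   # high SNR: channel dominates

rho_g = norm_corr(g, lag)
rho_h_low = norm_corr(h_low, lag)
rho_h_high = norm_corr(h_high, lag)
print(rho_g, rho_h_low, rho_h_high)
```

In this sketch the low SNR equalizer is very nearly a scaled conjugate of the channel, so its normalized correlation tracks that of the channel, while the high SNR equalizer is dominated by samples where the channel magnitude is small and decorrelates faster.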


6.6.2  Correlation  of  equalizer  coefficients  at  high  SNR 

At high SNR ($\lambda_{g,N} \gg \lambda_{r,1}$), the received data correlation matrix, $\mathbf{Q}[n]$, has the
approximation

$$
\mathbf{Q}[n] \approx \mathbf{G}_0^H[n]\,\mathbf{G}_0[n],
\tag{6.6.7}
$$

because the reduced channel convolution matrix product, $\mathbf{G}_0^H\mathbf{G}_0$, dominates the noise
correlation matrix. The dominant term of $\mathbf{Q}[n]$ is the reduced channel convolution
matrix product, so $\mathbf{Q}[n]$ is time-dependent.

When the SNR is high, the equalizer coefficient correlation matrix is

$$
\mathbf{R}_h[m] = E\left[\mathbf{h}[n]\,\mathbf{h}^H[n+m]\right] \approx E\left[\left(\mathbf{G}_0^H[n]\mathbf{G}_0[n]\right)^{-1}\mathbf{g}_0^*[n]\,\mathbf{g}_0^T[n+m]\left(\mathbf{G}_0^H[n+m]\mathbf{G}_0[n+m]\right)^{-1}\right]
\tag{6.6.8}
$$

The expectation is over the channel impulse response coefficients. Since the channel
impulse response coefficients appear in both the numerator and denominator, the
expectation is hard to evaluate. The equalizer correlation matrix is also no longer
a linear function of the channel correlation matrix, so one would not expect the
normalized correlation of the equalizer and channel coefficients to be equivalent.


Scalar  approximation  of  vector  equalizer 

At high SNR, the terms in eq. (6.6.1) can be written as

$$
\left(\mathbf{G}_0^H[n]\,\mathbf{G}_0[n]\right)\mathbf{h}_{ff}[n] = \mathbf{g}_0^*[n].
\tag{6.6.9}
$$

The conjugated channel coefficient vector is the product of the reduced channel
convolution matrix product and the feedforward equalizer coefficient vector. When the channel is
block changing, the reduced channel convolution matrix when the feedback equalizer
spans the complete delay-spread is given in eq. (6.5.13).

Using the column vector $\mathbf{e}_0$, where the first element is a one and the rest are zero,
the first column of the reduced channel convolution matrix product is

$$
\mathbf{G}_0^H[n]\,\mathbf{G}_0[n]\,\mathbf{e}_0 = \begin{bmatrix}
g[n,0]\,g^*[n,0] \\
g[n,0]\,g^*[n,1] \\
g[n,0]\,g^*[n,2] \\
\vdots \\
g[n,0]\,g^*[n,N-1]
\end{bmatrix}.
\tag{6.6.10}
$$

This relation implies a surprising result,

$$
\mathbf{G}_0^H[n]\,\mathbf{G}_0[n]\,\mathbf{e}_0 = g[n,0]\,\mathbf{g}_0^*[n].
$$

Dividing both sides by $g[n,0]$ and comparing with eq. (6.6.9) shows that
$\mathbf{h}[n] = \mathbf{e}_0 / g[n,0]$ satisfies the high SNR normal equations. The MMSE equalizer
coefficients at high SNR are therefore

$$
\mathbf{h}[n] = \begin{bmatrix} 1/g[n,0] & 0 & 0 & \cdots & 0 \end{bmatrix}^T.
\tag{6.6.11}
$$

Upon reflection, this result is not too surprising because the MMSE feedback
section removes all ISI, so inverting the first channel coefficient is the optimal equalizer
with no noise. Regardless of channel length, only one feedforward coefficient is needed
at very high SNR. This is not meant to be a way to reduce order (although it does
merit further study), but is meant to provide a tractable way to analyze the vector
equalizer using a scalar approximation. The next part provides an analysis of the
scalar equalizer correlation.

In practical equalizer implementations, a single equalizer coefficient would probably
not be desirable since the ISI cancellation is not perfect due to estimation error.
In this case, the feedforward section is used to reduce and reshape the residual error.
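
The single-coefficient result of eq. (6.6.11) is easy to confirm numerically. The sketch below assumes the square upper-triangular Toeplitz form of $\mathbf{G}_0$ from eq. (6.5.13); the channel values and the small noise variance are hypothetical, and the helper name `conv_matrix` is introduced here.

```python
import numpy as np

def conv_matrix(g):
    """Upper-triangular Toeplitz matrix with first row g, as in eq. (6.5.13)."""
    N = len(g)
    G = np.zeros((N, N), dtype=complex)
    for i in range(N):
        G[i, i:] = g[:N - i]
    return G

g = np.array([0.9 + 0.2j, -0.4 + 0.1j, 0.25 - 0.3j, 0.1j])  # hypothetical channel
N = len(g)
G0 = conv_matrix(g)
GHG = G0.conj().T @ G0

e0 = np.zeros(N, dtype=complex)
e0[0] = 1.0

# First column of G0^H G0, eq. (6.6.10): should equal g[0] * conj(g).
first_col = GHG @ e0

# Noise-free solution of (G0^H G0) h = g*: should equal e0 / g[0].
h_noise_free = np.linalg.solve(GHG, np.conj(g))
expected = e0 / g[0]

# With a small white-noise term the MMSE solution stays close to e0 / g[0].
sigma2 = 1e-8
h_mmse = np.linalg.solve(GHG + sigma2 * np.eye(N), np.conj(g))
print(np.round(h_noise_free, 6))
```

Only the first entry of `h_noise_free` is nonzero up to rounding, which matches the statement that a single feedforward coefficient suffices at very high SNR when the feedback section removes all ISI.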
Expectation of single coefficient equalizer

The  last  section  argued  that  at  high  SNR  regime  with  a  block-changing  channel  the 
DFE  feedforward  section  has  only  one  energetic  coefficient.  As  a  result,  the  analysis  of 
the  single  coefficient  equalizer  is  sufficient  to  characterize  the  full  feedforward  section. 
In  this  section,  a  statistical  analysis  is  given  for  the  single  coefficient  equalizer  showing 
that  the  equalizer  coefficient  has  finite  variance  and  thus  a  correlation  function  exists. 
In  the  next  section,  empirical  results  are  presented  which  verify  the  proposition  that 
the  equalizer  coefficients  have  a  shorter  correlation  time  than  do  channel  coefficients. 
Recall  from  eq.  (6.5.1)  that  the  single  coefficient  equalizer  is 


$$h = \frac{g^*}{g^* g + \sigma_v^2}. \qquad (6.6.12)$$


The channel coefficients are assumed to be Rayleigh fading, i.e. modeled as
circularly-symmetric, complex Gaussian random variables. One property of
circularly-symmetric Gaussian random variables is that the magnitude and phase are
independent. The magnitude is Rayleigh distributed and the phase is uniformly
distributed from 0 to $2\pi$ [70].

The  zero  mean  property  of  the  equalizer  coefficient  is  derived  first.  In  eq.  (6.6.12), 
the  complex  channel  coefficient  can  be  rewritten  in  magnitude-phase  form,  i.e. 


$$g = |g|\,e^{j\theta_g},$$


to  give  an  alternative  relation  for  the  equalizer  coefficient 


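$$h = \frac{|g|\,e^{-j\theta_g}}{|g|^2 + \sigma_v^2}. \qquad (6.6.13)$$

Since $\theta_g$ and $|g|$ are independent, the expectation of the equalizer coefficient is

$$\begin{aligned}
E\{h\} &= E\left\{\frac{|g|\,e^{-j\theta_g}}{|g|^2 + \sigma_v^2}\right\} \\
&= E\{e^{-j\theta_g}\}\,E\left\{\frac{|g|}{|g|^2 + \sigma_v^2}\right\} \\
&= 0,
\end{aligned} \qquad (6.6.14)$$

since

$$E\{e^{-j\theta_g}\} = \frac{1}{2\pi}\int_0^{2\pi} e^{-j\theta}\,d\theta = 0.$$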
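
As a quick sanity check of the zero-mean property, the sketch below draws circularly-symmetric complex Gaussian channel coefficients and averages the resulting single-coefficient equalizer. The variances, the sample count, and the seed are arbitrary values chosen for illustration and do not come from this thesis.

```python
import numpy as np

rng = np.random.default_rng(0)
num_trials = 200_000
sigma_g2 = 1.0   # channel coefficient variance (assumed value)
sigma_v2 = 0.1   # observation noise variance (assumed value)

# Circularly-symmetric complex Gaussian channel coefficients (Rayleigh fading).
g = np.sqrt(sigma_g2 / 2.0) * (
    rng.standard_normal(num_trials) + 1j * rng.standard_normal(num_trials)
)

# Single-coefficient equalizer h = g* / (|g|^2 + sigma_v^2).
h = np.conj(g) / (np.abs(g) ** 2 + sigma_v2)

# Sample averages of the phase term and of the equalizer coefficient.
mean_phase_term = np.mean(np.exp(-1j * np.angle(g)))
mean_h = np.mean(h)
print("|mean of exp(-j*theta)| =", abs(mean_phase_term))
print("|mean of h|             =", abs(mean_h))
```

Both averages should shrink toward zero as `num_trials` grows, in line with the uniform phase averaging to zero.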


The finite variance property of the equalizer coefficient is calculated next. Since
the equalizer coefficient is zero-mean, the variance is $E\{|h|^2\}$. Substituting the
expression for the equalizer coefficient from eq. (6.6.12) and evaluating the expectation
(a complete proof is below), the equalizer coefficient variance is

$$E\{|h|^2\} = \frac{1}{\sigma_g^2}\left[-1 + (1+\rho)\,e^{\rho}\int_{\rho}^{\infty}\frac{e^{-z}}{z}\,dz\right]. \qquad (6.6.15)$$


In this expression, $\rho = \sigma_v^2/\sigma_g^2$ is a modified version of the inverse signal
to noise ratio, $\sigma_g^2$ is the variance of the channel coefficient, and $\sigma_v^2$ is the
observation noise variance. The integral term,

$$E_1(\rho) = \int_{\rho}^{\infty}\frac{e^{-z}}{z}\,dz,$$

is a special integral known as the exponential integral function, with well known
bounds [1],

$$\frac{1}{2}\ln\left(1 + \frac{2}{\rho}\right) < e^{\rho}E_1(\rho) < \ln\left(1 + \frac{1}{\rho}\right). \qquad (6.6.16)$$

The lower bound of the variance is interesting for small values of $\rho$. As $\rho \to 0$
(i.e. $\sigma_v^2 \to 0$), the variance of the equalizer coefficients monotonically increases
and is bounded away from zero, since the lower bound in eq. (6.6.16) grows like
$\frac{1}{2}\ln(2/\rho)$. This implies that as the SNR is increased, the variance of
the equalizer coefficients also increases.


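The upper bound implies that

$$(1+\rho)\ln\left(1 + \frac{1}{\rho}\right) < \infty, \quad \rho > 0, \qquad (6.6.17)$$

and so the variance of the equalizer coefficient is finite,

$$E\{|h|^2\} = \frac{1}{\sigma_g^2}\left[-1 + (1+\rho)\,e^{\rho}E_1(\rho)\right] < \infty, \quad \rho, \sigma_g > 0. \qquad (6.6.18)$$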

Since the variance is bounded, the correlation function of the equalizer exists. The
correlation functions of the channel and equalizer coefficients are examined empirically
in the next section.
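
The closed-form variance can be cross-checked against a direct numerical evaluation of the expectation. The sketch below is only an illustration: it evaluates the exponential integral from its standard power series (adequate for small to moderate arguments), and the function names, step counts, and test SNR values are assumptions made for this example.

```python
import math

EULER_GAMMA = 0.5772156649015329

def exp_integral_e1(rho, terms=60):
    """E_1(rho) = -gamma - ln(rho) - sum_{k>=1} (-rho)^k / (k * k!), rho > 0."""
    total = -EULER_GAMMA - math.log(rho)
    term = 1.0
    for k in range(1, terms + 1):
        term *= -rho / k            # term now equals (-rho)^k / k!
        total -= term / k
    return total

def variance_closed_form(sigma_g2, sigma_v2):
    """(1/sigma_g^2) * [-1 + (1 + rho) * exp(rho) * E_1(rho)]."""
    rho = sigma_v2 / sigma_g2
    return (-1.0 + (1.0 + rho) * math.exp(rho) * exp_integral_e1(rho)) / sigma_g2

def variance_numeric(sigma_g2, sigma_v2, steps=200_000, t_max=40.0):
    """Midpoint-rule value of E{|h|^2} with t = |g|^2 / sigma_g^2 ~ Exp(1)."""
    rho = sigma_v2 / sigma_g2
    dt = t_max / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        total += t / (t + rho) ** 2 * math.exp(-t)
    return total * dt / sigma_g2

for snr_db in (0, 10, 20):
    sigma_v2 = 10.0 ** (-snr_db / 10.0)
    print(snr_db, "dB:", variance_closed_form(1.0, sigma_v2),
          variance_numeric(1.0, sigma_v2))
```

The two columns should agree closely, and the variance should grow as the SNR increases, consistent with the discussion of the lower bound.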


Derivation of equalizer coefficient variance. The variance of the equalizer
coefficient is $E\{|h|^2\}$. The term $|h|^2$ is

$$|h|^2 = \frac{|g|^2}{\left(|g|^2 + \sigma_v^2\right)^2}. \qquad (6.6.19)$$


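The channel coefficient $g$ is complex Gaussian with variance $\sigma_g^2$. This implies that
$\zeta = |g|^2$ is exponentially distributed with a PDF

$$p_\zeta(\zeta) = \frac{1}{\sigma_g^2}\,e^{-\zeta/\sigma_g^2}, \quad \zeta \ge 0. \qquad (6.6.20)$$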


Using this PDF, the expectation $E\{|h|^2\}$ can be directly evaluated using the
expression

$$E\{|h|^2\} = \frac{1}{\sigma_g^2}\int_0^{\infty}\frac{\zeta}{(\zeta + \sigma_v^2)^2}\,e^{-\zeta/\sigma_g^2}\,d\zeta. \qquad (6.6.21)$$


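Making a change of variable, $x = \zeta/\sigma_v^2$, the integral becomes slightly cleaner,

$$E\{|h|^2\} = \frac{1}{\sigma_g^2}\int_0^{\infty}\frac{x}{(x+1)^2}\,e^{-\rho x}\,dx, \qquad (6.6.22)$$

with $\rho = \sigma_v^2/\sigma_g^2$.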

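To evaluate this integral, integration by parts is used repeatedly. The first few
solution steps are

$$\begin{aligned}
E\{|h|^2\} &= \frac{1}{\sigma_g^2}\int_0^{\infty}\frac{x}{(x+1)^2}\,e^{-\rho x}\,dx \\
&= \frac{1}{\sigma_g^2}\left[-\rho\int_0^{\infty}e^{-\rho x}\,dx + \rho\int_0^{\infty}\frac{e^{-\rho x}}{1+x}\,dx + \rho\int_0^{\infty}e^{-\rho x}\ln(1+x)\,dx\right] \\
&= \frac{1}{\sigma_g^2}\left[-1 + \rho\int_0^{\infty}\frac{e^{-\rho x}}{1+x}\,dx + \rho\int_0^{\infty}e^{-\rho x}\ln(1+x)\,dx\right] \\
&= \frac{1}{\sigma_g^2}\left[-1 + \rho^2\int_0^{\infty}e^{-\rho x}\ln(1+x)\,dx + \rho\int_0^{\infty}e^{-\rho x}\ln(1+x)\,dx\right] \\
&= \frac{1}{\sigma_g^2}\left[-1 + \rho(1+\rho)\int_0^{\infty}e^{-\rho x}\ln(1+x)\,dx\right].
\end{aligned}$$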


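Another change of variable, $y = (1 + x)$, simplifies the remaining integral,

$$\begin{aligned}
E\{|h|^2\} &= \frac{1}{\sigma_g^2}\left[-1 + \rho(1+\rho)\int_1^{\infty}e^{-\rho(y-1)}\ln(y)\,dy\right] \\
&= \frac{1}{\sigma_g^2}\left[-1 + \rho(1+\rho)\,e^{\rho}\int_1^{\infty}e^{-\rho y}\ln(y)\,dy\right].
\end{aligned}$$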


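One final change of variable, $z = \rho y$, and some further simplification get the
expectation into the same form given earlier,

$$\begin{aligned}
E\{|h|^2\} &= \frac{1}{\sigma_g^2}\left[-1 + \rho(1+\rho)\,e^{\rho}\int_{\rho}^{\infty}e^{-z}\ln\left(\frac{z}{\rho}\right)\frac{dz}{\rho}\right] \\
&= \frac{1}{\sigma_g^2}\left[-1 + (1+\rho)\,e^{\rho}\int_{\rho}^{\infty}\left(e^{-z}\ln(z) - e^{-z}\ln(\rho)\right)dz\right] \\
&= \frac{1}{\sigma_g^2}\left[-1 + (1+\rho)\,e^{\rho}\int_{\rho}^{\infty}\frac{e^{-z}}{z}\,dz\right].
\end{aligned}$$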


6.7  Simulation  results:  equalizer  correlation 


In this section, simulation results are presented that show that the coherence time of
the equalizer coefficients is less than the coherence time of the channel impulse-response
coefficients.

Figure 6.7.1: The correlation of a single-coefficient Markov channel with a correlation
window of $N_{\mathrm{win}} = 2400$ symbols. There is a smooth transition from the channel
correlation down to a minimum correlation as the SNR increases.


6.7.1  Single-coefficient  channel  with  Markov  correlation 

The first simulation presented is a one-coefficient Markov channel. In this simulation,
the sampling frequency is $f_s = 2400$ samples per second, and the fading rate is 1/2
second (so the correlation width at 1/e is 2400 samples). The channel coefficients are
unit variance so $\sigma_g^2 = 1$.
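
A minimal sketch of this kind of experiment is given below. It is not the simulation code used for the figures: the AR(1) coefficient is chosen so the channel correlation falls to 1/e at `corr_samples`, the correlation length is scaled down to 50 samples to keep the run short, and the noise variances of 1e-6 and 1e2 are stand-ins for a high-SNR and a low-SNR case. All function names are hypothetical.

```python
import numpy as np

def ar1_channel(num_samples, corr_samples, rng):
    """Unit-variance complex AR(1) process; correlation is exp(-lag / corr_samples)."""
    alpha = np.exp(-1.0 / corr_samples)
    scale = np.sqrt((1.0 - alpha**2) / 2.0)
    drive = scale * (rng.standard_normal(num_samples)
                     + 1j * rng.standard_normal(num_samples))
    g = np.empty(num_samples, dtype=complex)
    g[0] = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2.0)
    for n in range(1, num_samples):
        g[n] = alpha * g[n - 1] + drive[n]
    return g

def normalized_correlation(x, lag):
    """Sample correlation at the given lag, normalized by the zero-lag value."""
    return np.vdot(x[:-lag], x[lag:]) / np.vdot(x, x)

def single_tap_equalizer(g, noise_var):
    """Single-coefficient MMSE equalizer with perfect channel knowledge."""
    return np.conj(g) / (np.abs(g) ** 2 + noise_var)

rng = np.random.default_rng(1)
g = ar1_channel(200_000, corr_samples=50, rng=rng)
lag = 25

r_g = abs(normalized_correlation(g, lag))
r_h_high_snr = abs(normalized_correlation(single_tap_equalizer(g, 1e-6), lag))
r_h_low_snr = abs(normalized_correlation(single_tap_equalizer(g, 1e2), lag))
print("channel correlation           :", r_g)
print("equalizer correlation, hi SNR :", r_h_high_snr)
print("equalizer correlation, low SNR:", r_h_low_snr)
```

Under these assumptions the high-SNR equalizer correlation should come out well below the channel correlation at the same lag, while the low-SNR value should sit close to the channel value.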

Figure  6.7.1  shows  the  correlation  function  for  a  single  coefficient  channel  with 
the  parameters  as  described  above.  The  channel  coefficients  are  shown  to  have  a 
longer  correlation  window  than  the  equalizer  coefficients.  Perfect  channel  knowledge  is 
assumed  when  calculating  the  equalizer  coefficients,  so  there  is  no  channel  estimation 
error. 

The results show that the coherence time of the equalizer coefficients in the
noise-free case is much shorter than that of the channel impulse response coefficients.

Figure 6.7.1 also shows the effect of reducing the SNR. There is a transition
from the low-noise (high SNR) regime at an SNR of 60 dB and greater to the high-noise
(low SNR) regime where the channel coefficients and the equalizer coefficients have
approximately the same correlation function.

No attempt has been made to analytically capture the transition from the low SNR
operating regime to the high SNR operating regime. The simulation results imply
that the coherence time of the equalizer coefficients is monotonically non-increasing
with SNR.

Figure 6.7.2: The correlation of a single-coefficient Gaussian channel with a correlation
window of $N_{\mathrm{win}} = 2400$ symbols. There is still a smooth transition from the
channel correlation down to a minimum correlation as the SNR increases, with a
slightly different shape than the AR(1) channel.


6.7.2  Single-coefficient  channel  with  Gaussian  correlation 

Figure 6.7.2 plots the correlation function of the channel and equalizer coefficients
when the channel has a Gaussian shaped correlation function. Similar results are
observed with this correlation function shape as were observed for the Markov
correlation model. At high SNR, the equalizer coefficients have a shorter
correlation time than the channel coefficients regardless of the shape of the correlation
function.
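
One way to produce a channel with a Gaussian shaped correlation function is to filter white complex Gaussian noise with a Gaussian kernel. The sketch below takes that route as an assumption; the thesis does not state how its Gaussian-correlated channel was generated. The kernel width, sample count, lag, and noise variance are illustrative values.

```python
import numpy as np

def gaussian_correlated_channel(num_samples, kernel_std, rng):
    """Unit-variance complex process with correlation exp(-lag^2 / (4 kernel_std^2))."""
    k = np.arange(-4 * kernel_std, 4 * kernel_std + 1)
    kernel = np.exp(-0.5 * (k / kernel_std) ** 2)
    kernel /= np.sqrt(np.sum(kernel ** 2))      # keeps the output variance at one
    n_white = num_samples + len(kernel) - 1
    white = (rng.standard_normal(n_white)
             + 1j * rng.standard_normal(n_white)) / np.sqrt(2.0)
    return np.convolve(white, kernel, mode="valid")

def normalized_correlation(x, lag):
    """Sample correlation at the given lag, normalized by the zero-lag value."""
    return np.vdot(x[:-lag], x[lag:]) / np.vdot(x, x)

rng = np.random.default_rng(2)
kernel_std = 25
g = gaussian_correlated_channel(200_000, kernel_std, rng)
h = np.conj(g) / (np.abs(g) ** 2 + 1e-6)        # high-SNR single-tap equalizer

lag = 2 * kernel_std                             # channel correlation is 1/e here
r_g = abs(normalized_correlation(g, lag))
r_h = abs(normalized_correlation(h, lag))
print("channel correlation  :", r_g)
print("equalizer correlation:", r_h)
```

With these settings the channel correlation at the chosen lag should be near 1/e, and the equalizer correlation should again be noticeably smaller.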


6.7.3 10-coefficient Markov correlated channel

This section provides simulation results which verify the claim that the
multi-coefficient equalizer is well approximated by inverting the first channel impulse
response coefficient in the noise-free regime. A 10-coefficient WSSUS channel model is
used, where each coefficient is generated independently using a Markov correlation
model. The energy in each coefficient is assumed to be equal and the direct path is
assumed to be the first coefficient (an arbitrary choice).

Figure 6.7.3: Realization of the 10-coefficient WSSUS, AR(1) channel where all
coefficients have equal variance. The sum of the average energy of all the channel
coefficients is unity. The color indicates intensity on a linear scale.

The  equalizer  is  a  DFE  where  perfect  channel  knowledge  is  assumed.  There  are  10 
feed-forward  coefficients  and  9  feedback  coefficients,  so  all  of  the  precursor  interference 
should  be  canceled.  With  perfect  channel  knowledge,  the  equalizer  can  use  the  current 
realization  of  the  channel.  Thus,  the  perfect  channel  knowledge  equalizer  is  the  MMSE 
equalizer. 

Figure 6.7.3 shows the complete realization of the channel and Figure 6.7.4 shows the
equalizer coefficients calculated from the known channel impulse response realization.
Figure 6.7.4 shows that the equalizer coefficients are dominated by the first coefficient.
Therefore, the use of a one-coefficient equalizer model for analysis is justified.

Using simulated data, the sample correlation coefficient, $\rho_{\mathrm{CORR}}(g^{-1}[n,0], h_{ff}[n,0])$,
between the inverse of the first channel coefficient, $g^{-1}[n,0]$, and the first equalizer
coefficient, $h_{ff}[n,0]$, is

$$\rho_{\mathrm{CORR}}(g^{-1}[n,0], h_{ff}[n,0]) = \frac{\sum_{i=1}^{N}\left(h_{ff}[i,0] - \mu_h\right)\left(g^{-1}[i,0] - \mu_{g^{-1}}\right)^*}{\sqrt{\sum_{i=1}^{N}\left|h_{ff}[i,0] - \mu_h\right|^2}\,\sqrt{\sum_{i=1}^{N}\left|g^{-1}[i,0] - \mu_{g^{-1}}\right|^2}} = 0.9935 + j\,0.0002, \qquad (6.7.1)$$

where

$$\mu_h = \frac{1}{N}\sum_{i=1}^{N}h_{ff}[i,0], \qquad \mu_{g^{-1}} = \frac{1}{N}\sum_{i=1}^{N}g^{-1}[i,0].$$

Figure 6.7.4: Realization of the MMSE feedforward DFE coefficients for the channel
shown in figure 6.7.3 with no noise. Notice that the first coefficient dominates over the
others for all time. The color corresponds to magnitude and the scale is $20\log_{10}(|h|)$.

This  correlation  coefficient  is  very  close  to  1,  confirming  that  there  is  a  high 
correlation  between  the  inverse  of  the  first  channel  coefficient  and  the  first  feedforward 
equalizer  coefficient.  The  imaginary  part  of  the  correlation  is  nearly  0,  showing  that 
the phases are well correlated.

Figure 6.7.5: Correlation between the first coefficient of the equalizer and the inverse
of the first coefficient of the channel. There is a strong linear correlation between the
two. There are also many non-correlated events, indicating the approximation is not
perfect.


Figure 6.7.5 shows a linear relationship between the magnitude of the first equalizer
coefficient and the inverse of the magnitude of the first channel coefficient. There
is observable noise indicating that the equalizer coefficient is not exactly equal to the
inverse of the first channel coefficient. The high correlation coefficient indicates that
they are nearly equal and so an equality approximation is justified.
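
A complex sample correlation coefficient of this kind can be computed as sketched below. The sequences are synthetic stand-ins rather than the simulated channel and equalizer data, and conjugating the second argument is a convention assumed for this example. The rotated copy illustrates why a nearly zero imaginary part indicates matched phases.

```python
import numpy as np

def complex_correlation_coefficient(x, y):
    """Sample correlation coefficient of complex sequences (y is conjugated)."""
    xc = x - np.mean(x)
    yc = y - np.mean(y)
    cross = np.vdot(yc, xc)                      # sum of xc * conj(yc)
    return cross / np.sqrt(np.vdot(xc, xc).real * np.vdot(yc, yc).real)

rng = np.random.default_rng(3)
n = 10_000
x = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2.0)
perturbation = 0.1 * (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2.0)

coef_noisy = complex_correlation_coefficient(x, x + perturbation)
coef_rotated = complex_correlation_coefficient(x, x * np.exp(0.3j))
print("noisy copy  :", coef_noisy)
print("rotated copy:", coef_rotated)
```

A small additive perturbation lowers the real part slightly while leaving the imaginary part near zero, whereas a constant phase offset between the two sequences shows up as a nonzero imaginary part.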

Figure 6.7.6 shows the normalized correlation function of the first equalizer
coefficient for several different SNR values. At low SNR the equalizer coefficient
correlation function is equivalent to the channel impulse-response correlation function.
A similar effect was observed for the one-coefficient equalizer in the previous subsection.
There is a smooth transition from the low-noise (high SNR) regime where one coefficient is
dominant and the correlation is low, to the high-noise (low SNR) regime where the
correlation function of the equalizer is the same as the channel impulse-response
correlation function. The transition region for the multiple-coefficient equalizer extends
over a wider SNR range than the single-coefficient equalizer.

Figure 6.7.6: Correlation function for the first coefficient of the channel and the first
coefficient of the equalizer for several SNR values. Notice that there is a smooth
transition from the correlation of the channel down to the 90 dB curve. This trend
continues as the SNR continues to be increased (not shown). The transition to the no-noise
correlation levels happens at much higher SNR than for the single coefficient channel
(shown in the figure at 200 dB).


To complete this discussion, the CEB and DA DFE are compared using the simulated
10-coefficient Rayleigh fading channel. Figure 6.7.7 shows the MSE results obtained
with the CEB and DA DFE algorithms.

The results show that the performance of the DA algorithm degrades faster
than that of the CEB algorithm. The reason is that the coherence time of the equalizer
coefficients is reduced as the SNR is increased, which decreases the optimal averaging
window. There is a lower limit to the useful averaging window, below which
the MSE increases rapidly. The correlation of the equalizer coefficients is reduced as
the SNR increases so the equalizer coefficients are data limited at a higher SNR than
the channel coefficients. A range of exponential weighting factors was used and the
results shown are for the weighting factor with the lowest MSE.
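
The link between coherence time and the best exponential weighting factor can be illustrated with a scalar stand-in. The sketch below is not the CEB or DA DFE used for the figures: it tracks a single AR(1) coefficient from known BPSK symbols with an exponentially weighted least-squares estimate and sweeps the weighting factor. The correlation lengths, noise variance, grid of factors, and function names are all assumptions made for illustration.

```python
import numpy as np

def ar1_process(num_samples, corr_samples, rng):
    """Unit-variance complex AR(1) process; correlation is exp(-lag / corr_samples)."""
    alpha = np.exp(-1.0 / corr_samples)
    scale = np.sqrt((1.0 - alpha**2) / 2.0)
    drive = scale * (rng.standard_normal(num_samples)
                     + 1j * rng.standard_normal(num_samples))
    g = np.empty(num_samples, dtype=complex)
    g[0] = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2.0)
    for n in range(1, num_samples):
        g[n] = alpha * g[n - 1] + drive[n]
    return g

def tracking_mse(g, symbols, received, forget, warmup=500):
    """MSE of the estimate formed from data up to n-1, compared with g[n]."""
    s_ud, s_dd, err = 0.0j, 0.0, 0.0
    for n in range(len(g)):
        if n >= warmup:                          # skip the start-up transient
            err += abs(g[n] - s_ud / s_dd) ** 2
        s_ud = forget * s_ud + received[n] * symbols[n]   # symbols are real (+1/-1)
        s_dd = forget * s_dd + 1.0
    return err / (len(g) - warmup)

def best_forgetting_factor(corr_samples, noise_var, factors, seed=0):
    """Sweep the weighting factor on one realization and return the best one."""
    rng = np.random.default_rng(seed)
    num = 20_000
    g = ar1_process(num, corr_samples, rng)
    symbols = rng.choice([-1.0, 1.0], size=num)
    noise = np.sqrt(noise_var / 2.0) * (rng.standard_normal(num)
                                        + 1j * rng.standard_normal(num))
    received = g * symbols + noise
    mse = [tracking_mse(g, symbols, received, lam) for lam in factors]
    return factors[int(np.argmin(mse))], mse

factors = [0.3, 0.5, 0.7, 0.8, 0.9, 0.95, 0.99]
best_short, mse_short = best_forgetting_factor(50, 0.1, factors)
best_long, mse_long = best_forgetting_factor(2000, 0.1, factors)
print("best factor, short coherence:", best_short)
print("best factor, long coherence :", best_long)
```

In this toy setting the coefficient with the shorter coherence time should favor a smaller weighting factor, that is a shorter effective averaging window, and the MSE should rise on either side of the best factor.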

Note  that  the  superior  MSE  performance  of  the  CEB  algorithm  depends  heavily 
on  the  accuracy  of  the  channel  model.  As  has  been  shown  in  other  chapters,  when 
the  channel  model  is  inaccurate,  the  performance  of  the  CEB  degrades  markedly 
compared with the DA equalizer.

Figure 6.7.7: Comparison of the CEB and DA algorithms using a 10-coefficient
Rayleigh fading channel.

6.8  Discussion 

This chapter compared the performance of the CEB equalizer algorithm with that
of the DA equalizer algorithm. At low SNR both methods have nearly equivalent MSE
performance since at low SNR the observation noise is the dominant error term.
Underwater communication systems generally operate with a low enough SNR that the
performance of the DA and CEB methods is equivalent, so the DA methods should be
considered when designing these systems since the computational complexity of the DA
method is much lower than that of the CEB method.

When the SNR is low, the equalizer coefficients and the channel impulse response
coefficients have the same correlation structure, so the DA and the CEB methods have
very similar MSE performance. The transition from the high-SNR to the low-SNR
regime was shown to be a transition from an operating regime where the statistics of
the received data correlation matrix are time varying to an operating regime where
the statistics are time-invariant. If the noise also had time-varying statistics, the CEB
would always outperform the DA algorithm.

Chapter 7


Summary  and  conclusions 

7.1  Summary  of  results 

Equalization  is  a  very  useful  communication  component  for  overcoming  ISI  in  a  highly 
time-spread  channel.  The  underwater  environment  provides  a  particularly  challenging 
environment  for  equalization  due  to  very  long  delay  spreads  and  time  variability.  This 
thesis  looked  at  several  aspects  and  improvements  of  the  EW-DFE  applied  to  the 
underwater  acoustic  channel. 

Several  of  the  key  results  provided  in  this  thesis  include: 

•  The  effective  noise  correlation  matrix  used  in  the  computation  of  a  CEB-DFE 
includes  off  diagonal  elements  due  to  correlated  channel  motion.  The  statistics 
of  the  channel  motion  are  nearly  time-invariant  and  so  estimation  techniques 
that  assume  the  error  correlation  matrix  is  Toeplitz  both  reduce  computational 
complexity  and  improve  performance. 

•  In  shallow  water  communication  channels,  the  arrival  angles  of  the  multipath 
components  are  bounded  into  a  narrow  cone  of  angles.  Beams  can  be  formed 
which  span  this  angular  spread  to  capture  most  of  the  energy  and  do  nearly  as 
well  as  adaptive  beamforming  but  without  some  of  the  instabilities  that  result 
from  fully  adaptive  methods. 

•  The number of multipath arrivals can be estimated using either a geometric
ray-path model when environmental data is available or using $\chi^2$ statistical
matching techniques. These techniques provide a method for determining the
number of beams that should be used for either an adaptive or non-adaptive
beamformer to both improve performance and reduce computational complexity
in a time-varying ocean environment.

•  Unmodeled channel impulse response coefficients become additional noise terms
when estimating equalizer coefficients using a CEB method. This may cause a
mismatch in the estimated noise correlation matrix, which leads to increased MSE at
the output of the DFE. Additional processing steps can be used to mitigate this
effect and improve performance.

•  Channel estimate based equalization has lower MSE than direct adaptation
equalization due to lower temporal correlation of equalizer coefficients at high
SNR. As the SNR is reduced, these two methods perform equivalently. At most
SNRs observed in experiments, the two methods perform nearly equally, so
the DA method is preferred if computational complexity is an issue.

•  A  DA  DFE  is  not  sensitive  to  modeling  errors  since  the  parameters  are  all 
estimated  directly  from  the  data.  When  environmental  information  is  available, 
however,  the  information  can  be  included  in  the  CEB-DFE  framework  easily 
which  can  increase  performance  dramatically. 


7.2  Future  directions 

This  work  suggests  several  directions  which  need  further  study.  The  first  direction  is  to 
identify  a  method  that  is  more  effective  at  estimating  channel  state  information  when 
little  is  known  about  the  channel  except  the  time  and  delay  spread  of  the  channel. 
This  includes  applying  adaptive  exponential  weighting  parameter  techniques  where 
the  exponential  weighting  factor  is  another  parameter  of  the  problem.  There  has  been 
some  work  on  this  in  the  literature,  but  the  techniques  are  still  crude  and  there  is 
still not a good formulation of the problem.

In this work, the equalizer coefficients were treated as the system being estimated
(direct adaptation techniques). Much of the literature has focused on CEB techniques
due to the improved performance at high SNR. CEB techniques require a
reliable channel model and accurate channel assumptions to be effective. This thesis
suggests that DA techniques are valuable, especially in lower SNR ranges due to their
lower computational complexity and lack of channel assumptions in the formulation.
Further work is needed to improve DA equalizers, especially in data limited environments;
adapting sparse or random matrix techniques for the DA equalizer may be useful.

Introducing channel knowledge into the DA equalizer to create a hybrid approach
between the CEB and DA equalizers would allow for a trade-off between performance
and complexity. When the channel state information is good, knowledge of the channel
reduces error. If some channel parameter is known, such as the number of channel
coefficients or the sparsity measure of the channel, error could be reduced for the DA
equalizer. Further study is needed to determine if the decrease in error is enough to
justify the increase in computational complexity.

An alternative to recovering the channel state information after using a set model
order  would  be  an  algorithm  which  adaptively  selects  the  appropriate  model  order. 
The  structure  of  a  universal  prediction  filter  would  be  an  appropriate  start.  This 
structure  runs  multiple  filter  lengths  simultaneously  and  chooses  the  model  order  (or 
combination  of  model  orders)  that  produces  the  lowest  MSE.  This  usually  requires  a 
lattice  filter  so  stability  issues  must  be  considered  carefully. 

To continue the work on beamforming methods, a method for including the
time-variability of the channel explicitly into the optimization problem, the adaptation
techniques, and the angle of arrival estimation would greatly improve all of these
methods. For the underwater environment, this is an especially hard problem due to the
plethora of channel types and causes of time-variability that are observed in the ocean.

None  of  the  results  in  this  thesis  make  any  assumptions  that  the  channel  is  sparse, 
even  though  we  know  the  UWA  communications  channel  often  is.  Much  work  has 
been  done  in  the  area  of  exploiting  this  sparseness,  so  combining  the  work  from  this 
thesis with work from the literature would help generalize the results.

One other large open problem not directly addressed in this thesis is how to
determine which paths and channel coefficients are useful to track. In order to make
this determination, a model which includes the estimation error as a function of the
correlation time of environmental parameters must be created. Thus far, models to
include this information are very crude and analytical results are only available for
the simplest models using basic assumptions. Access to these models would allow
better optimization criteria to be formed, which would lead to equalizers with better
performance.

The algorithms proposed in this thesis reduce computation and improve performance.
The improved performance at a low SNR could be used to transmit data
at or below the noise floor (especially for the array processing techniques) for covert
communication. More research is needed to apply these advances to improve
communication systems by reducing overall power, increasing the data rate (for a given
SNR), or both.